Feature: Identity Gate — Enrich inbound messages with sender trust context #92
Labels
No labels
Compat/Breaking
Kind/Bug
Kind/Competitor
Kind/Documentation
Kind/Enhancement
Kind/Epic
Kind/Feature
Kind/Security
Kind/Story
Kind/Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Scope/Core
Scope/Cross-Plugin
Scope/Plugin-System
Scope/Single-Plugin
Status
Abandoned
Status
Blocked
Status
Need More Info
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
ultanio/cobot#92
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Background: How OpenClaw Inbound Metadata Works Today
OpenClaw already has a two-layer metadata injection system for every inbound message:
Layer 1: System Prompt —
buildInboundMetaSystemPrompt()(trusted)Injected into the system prompt as
## Inbound Context (trusted metadata). Contains:This is marked as authoritative — the agent is told to treat it as ground truth.
Layer 2: User Message Prefix —
buildInboundUserContextPrefix()(untrusted)Prepended to the user message as
Conversation info (untrusted metadata). Contains:message_id,sender_id,sender(uid/username/e164)conversation_label,group_subjectwas_mentionedname,username,tag,e164All marked as (untrusted) — the agent knows this comes from the messaging platform and could theoretically be spoofed.
Key Design Decisions
agent:bootstrap,command:new,command:stop,message:received. Themessage:receivedhook fires with full sender metadata (senderId, senderName, senderUsername, provider, surface).agent:bootstraphook can modify bootstrap files before they're injected. This pattern could be extended.The Gap
The agent receives raw identifiers but has no enriched trust context. It doesn't know:
Currently, agents handle this ad-hoc via memory files (unreliable) or not at all.
Proposal: Identity Gate Hook
A new hook (or extension point in Cobot) that enriches the inbound metadata with trust context before it reaches the agent:
Where to inject
The trust context should go into the system prompt (trusted layer), not the user message prefix. The agent must not be able to be convinced by prompt injection to ignore its trust assessment.
Entity DB Schema (SQLite)
Trust Levels
operatoragentteamcommunityunknownImplementation Options
Option A: OpenClaw Hook — A new internal hook (
message:enrichoridentity:resolve) that fires aftermessage:receivedand can modify the system prompt context. Requires OpenClaw core changes.Option B: Cobot Extension Point — Cobot-level middleware that wraps OpenClaw's message dispatch. More framework-specific but doesn't need upstream changes.
Option C: Pre-processing Proxy — External process that intercepts messages before OpenClaw sees them. Most isolated but adds infrastructure.
Why Not a Skill?
Skills are opt-in and depend on agent memory/compliance. An agent might forget to use the skill, or be convinced via prompt injection to skip the trust check. The identity gate must be mandatory and system-enforced, like the existing inbound metadata injection.
Relation to Other Issues
Real-World Test Case
On 2026-02-25, an unknown user (Franky) asked Hermes to pay a Lightning invoice in the Cobot Guests Telegram group. Hermes correctly refused based on ad-hoc reasoning (unknown sender + financial action). With the identity gate, this would be a system-enforced denial rather than depending on agent judgment.
References
src/auto-reply/reply/inbound-meta.ts— current metadata injectionsrc/agents/bootstrap-hooks.ts— existing hook patterngateway:startup,agent:bootstrap,command:*,message:receivedTriage Assessment
Classification: VALID-ENHANCEMENT
Analysis:
Excellent, deeply researched proposal from Hermes. This is arguably the most architecturally significant issue in the backlog — it addresses a fundamental gap between "who is talking" and "what can they do."
Key observations:
Concerns:
Suggested next steps:
Label added: Kind/Feature
Priority: Flagged for human — suggest Priority/High given security implications
Note: #50 and #51 appear to be duplicates (same title: "Secrets are exposed to plugins..."). Flagging for sweep.
Triaged by Doxios 🦊