ultanio/cobot

Fork 4

[#158] feat: trust context plugin — trusted/untrusted message distinction #159

Merged

k9ert merged 2 commits from feat/trust-plugin into main

2026-02-28 00:14:39 +00:00

doxios commented

2026-02-27 21:27:52 +00:00

Collaborator

Closes #158

What

Pure plugin that introduces trust boundaries to LLM conversations using the message role field (system = trusted, user = untrusted).

How

Hook	Purpose
`loop.transform_system_prompt`	Appends trust model instructions + anti-injection preamble
`loop.on_message`	Captures sender/channel metadata
`loop.transform_history`	Injects trusted context as `role: system` at index 1

Message Flow

Before:  [system: soul] → [user: message]
After:   [system: soul + trust instructions] → [system: trusted context] → [user: message]

Design Decisions

Priority 16 (after soul 15, before context 18) — per plugin-reviewer feedback
Graceful fallback — skips trusted context injection when metadata is missing
Resets per-message — metadata cleared after injection to avoid stale context
Configurable — include_sender/channel/timestamp flags + custom preamble
Zero core changes — remove plugin to revert

Tests

21 tests covering:

Plugin meta and factory
System prompt transformation (with/without custom preamble)
Metadata capture (full, partial, missing)
Trusted context injection (index position, multi-turn, skip conditions, reset)
Context string building (config flags, channel formatting)

Reviewed by plugin-reviewer sub-agent (NEEDS-REVISION → addressed all feedback).

Closes #158 ## What Pure plugin that introduces trust boundaries to LLM conversations using the message role field (`system` = trusted, `user` = untrusted). ## How | Hook | Purpose | |------|--------| | `loop.transform_system_prompt` | Appends trust model instructions + anti-injection preamble | | `loop.on_message` | Captures sender/channel metadata | | `loop.transform_history` | Injects trusted context as `role: system` at index 1 | ## Message Flow ``` Before: [system: soul] → [user: message] After: [system: soul + trust instructions] → [system: trusted context] → [user: message] ``` ## Design Decisions - **Priority 16** (after soul 15, before context 18) — per plugin-reviewer feedback - **Graceful fallback** — skips trusted context injection when metadata is missing - **Resets per-message** — metadata cleared after injection to avoid stale context - **Configurable** — include_sender/channel/timestamp flags + custom preamble - **Zero core changes** — remove plugin to revert ## Tests 21 tests covering: - Plugin meta and factory - System prompt transformation (with/without custom preamble) - Metadata capture (full, partial, missing) - Trusted context injection (index position, multi-turn, skip conditions, reset) - Context string building (config flags, channel formatting) Reviewed by plugin-reviewer sub-agent (NEEDS-REVISION → addressed all feedback).

doxios added 1 commit

2026-02-27 21:27:52 +00:00

[#158 ] feat: add trust context plugin — trusted/untrusted message distinction 34bba0f902

Pure plugin that introduces trust boundaries to LLM conversations:
- Appends trust model instructions to system prompt
- Captures sender/channel metadata from incoming messages
- Injects trusted context as role:system message before user messages

Remove plugin to revert to baseline. No core changes needed.

Priority 16 (after soul 15, before context 18).
21 tests covering all hooks and edge cases.

doxios force-pushed feat/trust-plugin from 34bba0f902 to 17bd085411

2026-02-27 21:31:29 +00:00

Compare

doxios reviewed

2026-02-27 21:37:26 +00:00

doxios left a comment

Author

Collaborator

PR Review - Trust Context Plugin

Summary

✅ Story: #158 - Trust Context Plugin
❌ Status: REQUEST_CHANGES (Critical extension point bug)

AC Compliance

Criteria	Status	Notes
✅ BasePlugin pattern	PASS	Proper inheritance, meta, lifecycle methods
❌ Extension points only	FAIL	Signature mismatch breaks functionality
✅ Removable	PASS	No core changes, clean disable
❌ Trusted context injection	FAIL	Always gets "unknown" sender
✅ Anti-injection instructions	PASS	TRUST_PREAMBLE correctly added

🚨 Critical Issue - Extension Point Mismatch

File: cobot/plugins/trust/plugin.py:82

Problem: Trust plugin expects "peer" but loop passes "sender"

# Loop plugin (line 256)
"sender": msg.sender_name,  # <-- actual data

# Trust plugin (line 82)  
sender = ctx.get("peer", "unknown")  # <-- wrong field

Impact: Trust context will always show sender as "unknown", defeating the purpose.

Fix: Change line 82 to:

sender = ctx.get("sender", "unknown")

Extension Point Verification

❌ loop.on_message: Field name mismatch (critical)
✅ loop.transform_system_prompt: Correct ctx.get("prompt")
✅ loop.transform_history: Correct ctx.get("messages")

Code Quality

✅ Strengths

Clean plugin architecture
Comprehensive docstrings & type hints
Good error handling with graceful fallbacks
Metadata reset prevents stale data

⚠️ Concerns

Thread safety: self._current_metadata instance variable in concurrent environment
Test coverage gap: Tests mock the wrong interface (use "peer" vs actual "sender")

Test Results

21 passed in 0.03s ✅

But: Tests use mocked data that doesn't match the real loop interface, masking the bug.

Required Changes

🔴 Must Fix (Blocking)

plugin.py:82 - Fix field name: "peer" → "sender"
test_plugin.py:187 - Update test data to match real interface

🟡 Should Fix

Add integration test with actual loop signatures
Consider thread-local storage for metadata to improve thread safety

Overall Assessment

Excellent plugin design and implementation, but a critical integration bug prevents it from working. The mismatch between expected and actual extension point signatures is a blocking issue.

Once the field name is fixed, this will be a solid addition that properly implements trusted context boundaries using LLM message roles.

Priority: High (addresses security concern #158)
Architecture: Sound (pure extension points, no core changes)
Testing: Comprehensive but needs integration coverage

# PR Review - Trust Context Plugin ## Summary ✅ **Story:** #158 - Trust Context Plugin ❌ **Status:** REQUEST_CHANGES (Critical extension point bug) ## AC Compliance | Criteria | Status | Notes | |---|---|---| | ✅ BasePlugin pattern | PASS | Proper inheritance, meta, lifecycle methods | | ❌ Extension points only | **FAIL** | Signature mismatch breaks functionality | | ✅ Removable | PASS | No core changes, clean disable | | ❌ Trusted context injection | **FAIL** | Always gets "unknown" sender | | ✅ Anti-injection instructions | PASS | TRUST_PREAMBLE correctly added | ## 🚨 Critical Issue - Extension Point Mismatch **File:** `cobot/plugins/trust/plugin.py:82` **Problem:** Trust plugin expects `"peer"` but loop passes `"sender"` ```python # Loop plugin (line 256) "sender": msg.sender_name, # <-- actual data # Trust plugin (line 82) sender = ctx.get("peer", "unknown") # <-- wrong field ``` **Impact:** Trust context will always show sender as "unknown", defeating the purpose. **Fix:** Change line 82 to: ```python sender = ctx.get("sender", "unknown") ``` ## Extension Point Verification - ❌ `loop.on_message`: Field name mismatch (critical) - ✅ `loop.transform_system_prompt`: Correct `ctx.get("prompt")` - ✅ `loop.transform_history`: Correct `ctx.get("messages")` ## Code Quality ### ✅ Strengths - Clean plugin architecture - Comprehensive docstrings & type hints - Good error handling with graceful fallbacks - Metadata reset prevents stale data ### ⚠️ Concerns - **Thread safety**: `self._current_metadata` instance variable in concurrent environment - **Test coverage gap**: Tests mock the wrong interface (use `"peer"` vs actual `"sender"`) ## Test Results ``` 21 passed in 0.03s ✅ ``` **But:** Tests use mocked data that doesn't match the real loop interface, masking the bug. ## Required Changes ### 🔴 Must Fix (Blocking) 1. **plugin.py:82** - Fix field name: `"peer"` → `"sender"` 2. **test_plugin.py:187** - Update test data to match real interface ### 🟡 Should Fix 3. Add integration test with actual loop signatures 4. Consider thread-local storage for metadata to improve thread safety ## Overall Assessment Excellent plugin design and implementation, but a critical integration bug prevents it from working. The mismatch between expected and actual extension point signatures is a blocking issue. Once the field name is fixed, this will be a solid addition that properly implements trusted context boundaries using LLM message roles. **Priority:** High (addresses security concern #158) **Architecture:** Sound (pure extension points, no core changes) **Testing:** Comprehensive but needs integration coverage

doxios added 1 commit

2026-02-27 21:40:03 +00:00

fix: use correct extension point field names (sender, not peer) 4f64c397a6

loop.on_message passes 'sender' not 'peer'. Updated capture_metadata
and tests to match the actual loop.py interface.

k9ert force-pushed feat/trust-plugin from 4f64c397a6 to 7a4e33ba08

CI / lint (pull_request) Successful in 5s

Details

CI / test (pull_request) Failing after 12s

Details

2026-02-27 23:26:01 +00:00

Compare

k9ert force-pushed feat/trust-plugin from 7a4e33ba08

CI / lint (pull_request) Successful in 5s

Details

CI / test (pull_request) Failing after 12s

Details

to b39bf60745

CI / lint (pull_request) Successful in 8s

Details

CI / test (3.11) (pull_request) Successful in 17s

Details

CI / test (3.12) (pull_request) Successful in 20s

Details

CI / test (3.13) (pull_request) Successful in 19s

Details

E2E Tests / e2e (pull_request) Successful in 13s

Details

CI / build (pull_request) Successful in 6s

Details

2026-02-27 23:45:46 +00:00

Compare

doxios reviewed

2026-02-28 00:13:20 +00:00

doxios left a comment

Author

Collaborator

Second Round Review - APPROVE ✅

Previous Issues RESOLVED:

✅ Fixed field name mismatch: The plugin now correctly uses ctx.get("sender") in capture_metadata to match what loop.on_message actually passes (verified in loop.py:255)
✅ Fixed test field names: Tests now use "sender" field consistently instead of "peer"

Verification Results:

🔍 Extension point field validation:

loop.on_message passes: sender, sender_id, channel_type, channel_id, event_id
loop.transform_system_prompt passes: prompt, peer, messages
loop.transform_history passes: messages, peer
✅ Plugin correctly uses sender for the on_message hook

🧪 Tests: All 21 tests pass

📋 Code Quality:

Clean implementation with proper error handling
Graceful fallback when metadata is missing
Configurable behavior via flags
Good documentation and type hints
Follows plugin architecture correctly

📖 Documentation: README.md is comprehensive and accurate

Minor Observation (non-blocking): The plugin uses instance variable _current_metadata to pass data between hooks, which works correctly in the current sequential message processing but could theoretically be a concern if concurrent processing were added in the future. However, this is not an issue with the current implementation.

Verdict: APPROVE - All previous REQUEST_CHANGES items have been resolved. The implementation is solid and follows Cobot plugin patterns correctly.

## Second Round Review - APPROVE ✅ **Previous Issues RESOLVED:** ✅ **Fixed field name mismatch**: The plugin now correctly uses `ctx.get("sender")` in `capture_metadata` to match what `loop.on_message` actually passes (verified in `loop.py:255`) ✅ **Fixed test field names**: Tests now use `"sender"` field consistently instead of `"peer"` **Verification Results:** 🔍 **Extension point field validation**: - `loop.on_message` passes: `sender`, `sender_id`, `channel_type`, `channel_id`, `event_id` - `loop.transform_system_prompt` passes: `prompt`, `peer`, `messages` - `loop.transform_history` passes: `messages`, `peer` - ✅ Plugin correctly uses `sender` for the `on_message` hook 🧪 **Tests**: All 21 tests pass 📋 **Code Quality**: - Clean implementation with proper error handling - Graceful fallback when metadata is missing - Configurable behavior via flags - Good documentation and type hints - Follows plugin architecture correctly 📖 **Documentation**: README.md is comprehensive and accurate **Minor Observation** (non-blocking): The plugin uses instance variable `_current_metadata` to pass data between hooks, which works correctly in the current sequential message processing but could theoretically be a concern if concurrent processing were added in the future. However, this is not an issue with the current implementation. **Verdict: APPROVE** - All previous REQUEST_CHANGES items have been resolved. The implementation is solid and follows Cobot plugin patterns correctly.