[#158] feat: trust context plugin — trusted/untrusted message distinction #159

Merged
k9ert merged 2 commits from feat/trust-plugin into main 2026-02-28 00:14:39 +00:00
Collaborator

Closes #158

What

Pure plugin that introduces trust boundaries to LLM conversations using the message role field (system = trusted, user = untrusted).

How

Hook Purpose
loop.transform_system_prompt Appends trust model instructions + anti-injection preamble
loop.on_message Captures sender/channel metadata
loop.transform_history Injects trusted context as role: system at index 1

Message Flow

Before:  [system: soul] → [user: message]
After:   [system: soul + trust instructions] → [system: trusted context] → [user: message]

Design Decisions

  • Priority 16 (after soul 15, before context 18) — per plugin-reviewer feedback
  • Graceful fallback — skips trusted context injection when metadata is missing
  • Resets per-message — metadata cleared after injection to avoid stale context
  • Configurable — include_sender/channel/timestamp flags + custom preamble
  • Zero core changes — remove plugin to revert

Tests

21 tests covering:

  • Plugin meta and factory
  • System prompt transformation (with/without custom preamble)
  • Metadata capture (full, partial, missing)
  • Trusted context injection (index position, multi-turn, skip conditions, reset)
  • Context string building (config flags, channel formatting)

Reviewed by plugin-reviewer sub-agent (NEEDS-REVISION → addressed all feedback).

Closes #158 ## What Pure plugin that introduces trust boundaries to LLM conversations using the message role field (`system` = trusted, `user` = untrusted). ## How | Hook | Purpose | |------|--------| | `loop.transform_system_prompt` | Appends trust model instructions + anti-injection preamble | | `loop.on_message` | Captures sender/channel metadata | | `loop.transform_history` | Injects trusted context as `role: system` at index 1 | ## Message Flow ``` Before: [system: soul] → [user: message] After: [system: soul + trust instructions] → [system: trusted context] → [user: message] ``` ## Design Decisions - **Priority 16** (after soul 15, before context 18) — per plugin-reviewer feedback - **Graceful fallback** — skips trusted context injection when metadata is missing - **Resets per-message** — metadata cleared after injection to avoid stale context - **Configurable** — include_sender/channel/timestamp flags + custom preamble - **Zero core changes** — remove plugin to revert ## Tests 21 tests covering: - Plugin meta and factory - System prompt transformation (with/without custom preamble) - Metadata capture (full, partial, missing) - Trusted context injection (index position, multi-turn, skip conditions, reset) - Context string building (config flags, channel formatting) Reviewed by plugin-reviewer sub-agent (NEEDS-REVISION → addressed all feedback).
Pure plugin that introduces trust boundaries to LLM conversations:
- Appends trust model instructions to system prompt
- Captures sender/channel metadata from incoming messages
- Injects trusted context as role:system message before user messages

Remove plugin to revert to baseline. No core changes needed.

Priority 16 (after soul 15, before context 18).
21 tests covering all hooks and edge cases.
doxios force-pushed feat/trust-plugin from 34bba0f902 to 17bd085411 2026-02-27 21:31:29 +00:00 Compare
doxios left a comment
Author
Collaborator

PR Review - Trust Context Plugin

Summary

Story: #158 - Trust Context Plugin
Status: REQUEST_CHANGES (Critical extension point bug)

AC Compliance

Criteria Status Notes
BasePlugin pattern PASS Proper inheritance, meta, lifecycle methods
Extension points only FAIL Signature mismatch breaks functionality
Removable PASS No core changes, clean disable
Trusted context injection FAIL Always gets "unknown" sender
Anti-injection instructions PASS TRUST_PREAMBLE correctly added

🚨 Critical Issue - Extension Point Mismatch

File: cobot/plugins/trust/plugin.py:82

Problem: Trust plugin expects "peer" but loop passes "sender"

# Loop plugin (line 256)
"sender": msg.sender_name,  # <-- actual data

# Trust plugin (line 82)  
sender = ctx.get("peer", "unknown")  # <-- wrong field

Impact: Trust context will always show sender as "unknown", defeating the purpose.

Fix: Change line 82 to:

sender = ctx.get("sender", "unknown")

Extension Point Verification

  • loop.on_message: Field name mismatch (critical)
  • loop.transform_system_prompt: Correct ctx.get("prompt")
  • loop.transform_history: Correct ctx.get("messages")

Code Quality

Strengths

  • Clean plugin architecture
  • Comprehensive docstrings & type hints
  • Good error handling with graceful fallbacks
  • Metadata reset prevents stale data

⚠️ Concerns

  • Thread safety: self._current_metadata instance variable in concurrent environment
  • Test coverage gap: Tests mock the wrong interface (use "peer" vs actual "sender")

Test Results

21 passed in 0.03s ✅

But: Tests use mocked data that doesn't match the real loop interface, masking the bug.

Required Changes

🔴 Must Fix (Blocking)

  1. plugin.py:82 - Fix field name: "peer""sender"
  2. test_plugin.py:187 - Update test data to match real interface

🟡 Should Fix

  1. Add integration test with actual loop signatures
  2. Consider thread-local storage for metadata to improve thread safety

Overall Assessment

Excellent plugin design and implementation, but a critical integration bug prevents it from working. The mismatch between expected and actual extension point signatures is a blocking issue.

Once the field name is fixed, this will be a solid addition that properly implements trusted context boundaries using LLM message roles.

Priority: High (addresses security concern #158)
Architecture: Sound (pure extension points, no core changes)
Testing: Comprehensive but needs integration coverage

# PR Review - Trust Context Plugin ## Summary ✅ **Story:** #158 - Trust Context Plugin ❌ **Status:** REQUEST_CHANGES (Critical extension point bug) ## AC Compliance | Criteria | Status | Notes | |---|---|---| | ✅ BasePlugin pattern | PASS | Proper inheritance, meta, lifecycle methods | | ❌ Extension points only | **FAIL** | Signature mismatch breaks functionality | | ✅ Removable | PASS | No core changes, clean disable | | ❌ Trusted context injection | **FAIL** | Always gets "unknown" sender | | ✅ Anti-injection instructions | PASS | TRUST_PREAMBLE correctly added | ## 🚨 Critical Issue - Extension Point Mismatch **File:** `cobot/plugins/trust/plugin.py:82` **Problem:** Trust plugin expects `"peer"` but loop passes `"sender"` ```python # Loop plugin (line 256) "sender": msg.sender_name, # <-- actual data # Trust plugin (line 82) sender = ctx.get("peer", "unknown") # <-- wrong field ``` **Impact:** Trust context will always show sender as "unknown", defeating the purpose. **Fix:** Change line 82 to: ```python sender = ctx.get("sender", "unknown") ``` ## Extension Point Verification - ❌ `loop.on_message`: Field name mismatch (critical) - ✅ `loop.transform_system_prompt`: Correct `ctx.get("prompt")` - ✅ `loop.transform_history`: Correct `ctx.get("messages")` ## Code Quality ### ✅ Strengths - Clean plugin architecture - Comprehensive docstrings & type hints - Good error handling with graceful fallbacks - Metadata reset prevents stale data ### ⚠️ Concerns - **Thread safety**: `self._current_metadata` instance variable in concurrent environment - **Test coverage gap**: Tests mock the wrong interface (use `"peer"` vs actual `"sender"`) ## Test Results ``` 21 passed in 0.03s ✅ ``` **But:** Tests use mocked data that doesn't match the real loop interface, masking the bug. ## Required Changes ### 🔴 Must Fix (Blocking) 1. **plugin.py:82** - Fix field name: `"peer"` → `"sender"` 2. **test_plugin.py:187** - Update test data to match real interface ### 🟡 Should Fix 3. Add integration test with actual loop signatures 4. Consider thread-local storage for metadata to improve thread safety ## Overall Assessment Excellent plugin design and implementation, but a critical integration bug prevents it from working. The mismatch between expected and actual extension point signatures is a blocking issue. Once the field name is fixed, this will be a solid addition that properly implements trusted context boundaries using LLM message roles. **Priority:** High (addresses security concern #158) **Architecture:** Sound (pure extension points, no core changes) **Testing:** Comprehensive but needs integration coverage
loop.on_message passes 'sender' not 'peer'. Updated capture_metadata
and tests to match the actual loop.py interface.
k9ert force-pushed feat/trust-plugin from 4f64c397a6 to 7a4e33ba08
Some checks failed
CI / lint (pull_request) Successful in 5s
CI / test (pull_request) Failing after 12s
2026-02-27 23:26:01 +00:00
Compare
k9ert force-pushed feat/trust-plugin from 7a4e33ba08
Some checks failed
CI / lint (pull_request) Successful in 5s
CI / test (pull_request) Failing after 12s
to b39bf60745
All checks were successful
CI / lint (pull_request) Successful in 8s
CI / test (3.11) (pull_request) Successful in 17s
CI / test (3.12) (pull_request) Successful in 20s
CI / test (3.13) (pull_request) Successful in 19s
E2E Tests / e2e (pull_request) Successful in 13s
CI / build (pull_request) Successful in 6s
2026-02-27 23:45:46 +00:00
Compare
doxios left a comment
Author
Collaborator

Second Round Review - APPROVE

Previous Issues RESOLVED:

Fixed field name mismatch: The plugin now correctly uses ctx.get("sender") in capture_metadata to match what loop.on_message actually passes (verified in loop.py:255)
Fixed test field names: Tests now use "sender" field consistently instead of "peer"

Verification Results:

🔍 Extension point field validation:

  • loop.on_message passes: sender, sender_id, channel_type, channel_id, event_id
  • loop.transform_system_prompt passes: prompt, peer, messages
  • loop.transform_history passes: messages, peer
  • Plugin correctly uses sender for the on_message hook

🧪 Tests: All 21 tests pass

📋 Code Quality:

  • Clean implementation with proper error handling
  • Graceful fallback when metadata is missing
  • Configurable behavior via flags
  • Good documentation and type hints
  • Follows plugin architecture correctly

📖 Documentation: README.md is comprehensive and accurate

Minor Observation (non-blocking): The plugin uses instance variable _current_metadata to pass data between hooks, which works correctly in the current sequential message processing but could theoretically be a concern if concurrent processing were added in the future. However, this is not an issue with the current implementation.

Verdict: APPROVE - All previous REQUEST_CHANGES items have been resolved. The implementation is solid and follows Cobot plugin patterns correctly.

## Second Round Review - APPROVE ✅ **Previous Issues RESOLVED:** ✅ **Fixed field name mismatch**: The plugin now correctly uses `ctx.get("sender")` in `capture_metadata` to match what `loop.on_message` actually passes (verified in `loop.py:255`) ✅ **Fixed test field names**: Tests now use `"sender"` field consistently instead of `"peer"` **Verification Results:** 🔍 **Extension point field validation**: - `loop.on_message` passes: `sender`, `sender_id`, `channel_type`, `channel_id`, `event_id` - `loop.transform_system_prompt` passes: `prompt`, `peer`, `messages` - `loop.transform_history` passes: `messages`, `peer` - ✅ Plugin correctly uses `sender` for the `on_message` hook 🧪 **Tests**: All 21 tests pass 📋 **Code Quality**: - Clean implementation with proper error handling - Graceful fallback when metadata is missing - Configurable behavior via flags - Good documentation and type hints - Follows plugin architecture correctly 📖 **Documentation**: README.md is comprehensive and accurate **Minor Observation** (non-blocking): The plugin uses instance variable `_current_metadata` to pass data between hooks, which works correctly in the current sequential message processing but could theoretically be a concern if concurrent processing were added in the future. However, this is not an issue with the current implementation. **Verdict: APPROVE** - All previous REQUEST_CHANGES items have been resolved. The implementation is solid and follows Cobot plugin patterns correctly.
k9ert merged commit 2738f17e0b into main 2026-02-28 00:14:39 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ultanio/cobot!159
No description provided.