feat: add peer interaction ledger plugin #226

Open
David wants to merge 23 commits from David/cobot:feature/interaction-ledger into main
Contributor

Adds the ledger plugin — a local SQLite-backed interaction ledger that gives Cobot agents persistent memory of peer interactions and behavioral trust assessment.

What it does:

  • Automatically records every incoming/outgoing message per peer via hook pipeline
  • Injects peer context (interaction history, trust scores, rationale) into the system prompt before every LLM call
  • Provides 3 LLM tools: assess_peer (dual-score: deterministic info_score + LLM trust + mandatory rationale), query_peer, list_peers
  • CLI commands: cobot ledger list, cobot ledger show <peer>, cobot ledger summary
  • Zero new dependencies (stdlib sqlite3), zero changes to existing plugins

Files: 8 new files in cobot/plugins/ledger/ (~600 LOC + ~120 tests)

Prior art: Data model grounded in bitcoin-otc WoT (source, target, score, rationale, timestamp). Prerequisite for future centralized WoT (Phase 3).
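The WoT-derived record described above can be sketched as a dataclass; field names and types here are illustrative assumptions, not the plugin's actual model:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative sketch of the bitcoin-otc-style assessment record;
# names and types are assumptions, not the plugin's actual schema.
@dataclass
class Assessment:
    source: str        # assessing agent
    target: str        # assessed peer
    info_score: float  # deterministic component, 0-10
    trust: float       # LLM-assigned component
    rationale: str     # mandatory free-text justification
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # Mirrors the "mandatory rationale" requirement.
        if not self.rationale.strip():
            raise ValueError("rationale must not be empty")
```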

This commit introduces a comprehensive Product Requirements Document (PRD) detailing the Interaction Ledger for Cobot agents. The document outlines the project's classification, success criteria, and the unique features of the ledger, which serves as a structured record of agent interactions. It emphasizes the importance of trust infrastructure for agent cooperation and provides a clear framework for future development.
This commit revises the Product Requirements Document (PRD) for the Cobot Interaction Ledger, incorporating a dual-score assessment model that includes both an information-quality score and a trust score. Key updates include the reconciliation of score semantics with user journeys, enhancements to the assessment framework, and clarifications on the rationale's role in capturing interaction context. The document now emphasizes the importance of maintaining a comprehensive history of peer interactions and the implications for future assessments.
This commit introduces a new Product Requirements Document (PRD) for the Cobot Simulation & Observability Suite. The document outlines the project's classification, key components including the observability plugin, multi-agent simulation infrastructure, and WoT graph visualization web app. It emphasizes the importance of real-time observability and simulation for validating agent cooperation hypotheses, and details the prerequisites and security considerations for the observability model.
This commit introduces a new Product Requirements Document (PRD) specifically for the Cobot Observability Plugin. The document details the plugin's architecture, functionality, and prerequisites, emphasizing its role in providing real-time observability of agent interactions. It outlines the plugin's independent utility, security considerations for simulation use, and the importance of an actor-agnostic event stream for various consumers. This PRD is extracted from the combined Simulation & Observability Suite PRD to enhance clarity and focus on the plugin's features.
This commit enhances the Product Requirements Document (PRD) for the Cobot Simulation & Observability Suite by integrating feedback from the Doxios review (#224). Key updates include a resolved scenario orchestrator architecture, added core prerequisites, hardware and cost requirements, and a strengthened security stance for simulation-only use. The document now emphasizes the observability plugin's independent utility, the distinction between real agents and scripted actors, and the necessity for a granular access control model before production deployment.
This commit deletes the existing Product Requirements Document (PRD) for the Cobot Simulation & Observability Suite, which is no longer needed following the integration of its components into separate, focused documents. The removal streamlines documentation and enhances clarity for ongoing development efforts.
This commit adds a new Ledger plugin that records agent-to-agent interactions and enables behavioral assessments through a dual-score model (info_score and trust). It includes a SQLite database layer for managing peer data, interaction records, and assessments, along with a comprehensive README detailing its features and configuration. Additionally, unit tests are provided to ensure functionality and reliability.
This commit updates the Ledger plugin to clear the ContextVar for excluded senders, preventing incorrect peer attribution. It also introduces new unit tests to verify the isolation of sender identities across concurrent tasks and ensure that database errors are logged appropriately without disrupting message handling. These changes enhance the reliability and correctness of the plugin's behavior in multi-threaded scenarios.
This commit modifies the compute_info_score method to accept an assessment_count parameter, allowing for a more accurate calculation of the info_score based on both interaction data and the number of assessments. Additionally, it updates the enrich_prompt method to correctly exclude the current assessment from the trust trajectory while displaying the current trust score. Unit tests are also updated to reflect these changes, ensuring the functionality remains intact.
This commit updates the Ledger plugin's documentation to include a comprehensive score guide that explains the info_score and trust score metrics. It clarifies how these scores are interpreted and presented in the peer context, ensuring users understand the significance of the scores during interactions. Additionally, the documentation reflects changes in the plugin's output format, removing the score denominator for trust and ensuring consistency across the system prompt. These enhancements improve clarity and usability for developers and users alike.
This commit introduces a new test class, TestAssessmentFlow, which implements a series of end-to-end tests for the assessment lifecycle in the Ledger plugin. The tests cover recording assessments, retrieving the latest assessment, and fetching the assessment history, ensuring that the system behaves correctly across various scenarios. Additionally, boundary tests for info scores and trust values are included to validate acceptance and rejection criteria, along with checks for rationale integrity and timestamp ordering. These enhancements improve test coverage and reliability of the assessment features.
This commit introduces a validation check to ensure that the rationale provided during assessment recording is not empty or whitespace-only, raising a ValueError when this condition is met. Additionally, it enhances the test suite by adding tests for empty and whitespace rationale rejection, as well as enforcing foreign key constraints for assessments. These changes improve data integrity and robustness of the assessment flow in the Ledger plugin.
This commit introduces three new tools in the Ledger plugin: assess_peer, query_peer, and list_peers. The assess_peer tool allows users to record behavioral assessments for peers, including a trust score and rationale. The query_peer tool retrieves detailed information about a peer, including interaction statistics and the latest assessment. The list_peers tool provides a list of known peers with their assessment scores. Additionally, the get_definitions method is updated to return these tools, and corresponding tests are added to ensure functionality and reliability.
This commit introduces a comprehensive Product Requirements Document (PRD) for the Cobot Interaction Ledger, detailing its purpose, functionality, and integration within the Cobot ecosystem. The document outlines the interaction ledger's role in enabling agents to track and assess their interactions, emphasizing its importance for building trust infrastructure. It includes sections on project classification, success criteria for users and developers, and technical specifications, providing a clear framework for future development and integration efforts.
This commit introduces a new method, get_assessment_count, in the LedgerDB class to retrieve the number of assessments for a given peer. The LedgerPlugin is updated to utilize this method for calculating assessment counts, improving the accuracy of info score computations. Additionally, error handling for trust values is enhanced, ensuring that trust scores remain within the valid range. The test suite is expanded with new tests for database availability and trust value validation, ensuring robustness in the assessment process.
This commit introduces a new CLI interface for the ledger plugin, allowing users to interact with the peer interaction ledger through commands such as `list`, `show`, and `summary`. The `list` command displays known peers along with their interaction scores, while the `show` command provides detailed history for a specific peer. The `summary` command offers aggregate statistics for the ledger. Additionally, the CLI is integrated into the plugin's command registration process, enhancing usability and accessibility for users managing peer interactions.
feat: enhance peer assessment summary and info score calculation in Ledger plugin
All checks were successful
CI / lint (pull_request) Successful in 9s
E2E Tests / e2e (pull_request) Successful in 18s
CI / test (3.11) (pull_request) Successful in 43s
CI / test (3.12) (pull_request) Successful in 44s
CI / test (3.13) (pull_request) Successful in 43s
CI / build (pull_request) Successful in 7s
a4dc084d76
This commit introduces a new method, get_peer_assessment_summary, in the LedgerDB class to efficiently retrieve the latest assessment and count for multiple peers. The CLI commands are updated to utilize this summary, improving the display of known peers with their interaction counts and info scores. The compute_info_score function is refactored for better clarity and is now used directly in the plugin, ensuring accurate score calculations based on interaction data and assessment counts. Additionally, tests are updated to reflect these changes, enhancing overall functionality and reliability.
This commit revises the Product Requirements Document (PRD) for the peer interaction ledger, specifically updating the section on extension points. The previously deferred extension points have been moved from Phase 2 to Phase 1, as the Observability Plugin is now identified as the consumer. This change enhances clarity regarding the project's architectural direction and aligns with the ongoing development efforts outlined in the related epic and story.
This commit introduces two new extension points, "ledger.after_record" and "ledger.after_assess", to the Ledger plugin. These points allow for asynchronous handling of events after recording interactions and assessments, enhancing the plugin's extensibility. The implementation includes error handling to ensure that extension errors do not disrupt the recording process. Additionally, comprehensive tests are added to verify the correct firing of these extension points under various scenarios, improving the overall reliability and functionality of the plugin.
fix: improve error handling and testing for Ledger plugin extension points
All checks were successful
CI / lint (pull_request) Successful in 10s
E2E Tests / e2e (pull_request) Successful in 17s
CI / test (3.11) (pull_request) Successful in 43s
CI / test (3.12) (pull_request) Successful in 45s
CI / test (3.13) (pull_request) Successful in 44s
CI / build (pull_request) Successful in 7s
54fa9c7026
This commit enhances the Ledger plugin by refining error handling for the "ledger.after_assess" extension point, ensuring that exceptions from asynchronous tasks are logged appropriately. Additionally, it updates tests to verify that extension points do not fire on write failures and that assessments are recorded even when extension handlers encounter errors. These changes improve the robustness and reliability of the plugin's interaction recording functionality.
Collaborator

Code Review — PR #226: Peer Interaction Ledger Plugin

Reviewer: Doxios 🦊 | Requested by: Zeus


Overall Assessment: Approve (with minor suggestions)

This is a well-structured, cleanly implemented plugin. Good separation (models/db/plugin/cli), solid test coverage (~1900 LOC tests for ~600 LOC code), zero new deps, and no changes to existing plugins. The dual-score model (deterministic info_score + LLM-assigned trust) is a sound design.


Review by Focus Area

1. Hook Pipeline Integration

The loop.on_message → loop.after_send → loop.transform_system_prompt flow is correct. Messages are recorded on receive, outgoing messages are tracked via after_send, and the prompt enrichment happens in transform_system_prompt. Clean and idiomatic.

2. ContextVar for Sender Tracking ⚠️ Minor Concern

The _current_sender ContextVar works correctly for the stated use case: correlating handle_send with the sender from handle_message. However:

  • Module-level ContextVar: This is a singleton shared across all plugin instances. If Cobot ever runs multiple LedgerPlugin instances (unlikely but possible), they'd share sender state. Low risk given current architecture.
  • No explicit reset after send: _current_sender is set in handle_message and read in handle_send/enrich_prompt, but it's never explicitly cleared after the message cycle completes. If a code path calls handle_send without a prior handle_message in the same context, it could pick up a stale sender. Again, low risk with current hook ordering, but a _current_sender.set(None) at the end of handle_send would be defensive.
  • Async safety: ContextVars are async-safe (each Task gets its own copy), so concurrent messages won't cross-contaminate.

Suggestion: Add _current_sender.set(None) at the end of handle_send as defensive cleanup.
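The suggested defensive-reset pattern, sketched with a module-level ContextVar (the plugin's actual handler signatures and recording logic are assumed):

```python
import contextvars

# Sketch of the suggested defensive reset; handler signatures are assumptions.
_current_sender: contextvars.ContextVar = contextvars.ContextVar(
    "_current_sender", default=None
)

def handle_message(sender: str) -> None:
    # Remember who we are talking to for the rest of this message cycle.
    _current_sender.set(sender)

def handle_send():
    sender = _current_sender.get()
    # ... record the outgoing interaction for `sender` here ...
    _current_sender.set(None)  # defensive cleanup: no stale sender on reuse
    return sender
```

Because ContextVars are per-Task, concurrent message cycles still keep independent sender state; the reset only guards against stale reads within the same context.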

3. DB Schema Design

Clean and appropriate:

  • peers table with atomic upsert via ON CONFLICT — good
  • interactions with CHECK constraint on direction — good
  • assessments with range checks on info_score and trust — good
  • Proper indexes on (peer_id, created_at) for both interactions and assessments
  • PRAGMA foreign_keys = ON — good
  • os.chmod(db_path, 0o600) on create — nice security touch

One note: the upsert_peer increments interaction_count on every call, but record_interaction is called separately. This means interaction_count on the peer tracks upsert calls, not actual interaction records. Currently these are 1:1 (upsert is always followed by record_interaction in handle_message), but if someone calls upsert_peer without record_interaction, the count drifts. Consider whether interaction_count should be derived from COUNT(interactions) or if the denormalized counter is intentional for performance.
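The schema features called out above, including the denormalized counter that can drift, can be sketched with stdlib sqlite3; the table and column names are assumptions based on the review, not the plugin's actual DDL:

```python
import sqlite3

# Minimal sketch of the reviewed schema features; names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE peers (
        peer_id TEXT PRIMARY KEY,
        interaction_count INTEGER NOT NULL DEFAULT 0,
        last_seen TEXT
    )
""")
conn.execute("""
    CREATE TABLE interactions (
        id INTEGER PRIMARY KEY,
        peer_id TEXT NOT NULL REFERENCES peers(peer_id),
        direction TEXT NOT NULL CHECK (direction IN ('incoming', 'outgoing')),
        created_at TEXT NOT NULL
    )
""")
conn.execute("CREATE INDEX idx_interactions ON interactions(peer_id, created_at)")

def upsert_peer(peer_id: str, now: str) -> None:
    # Atomic insert-or-update: the counter tracks upsert calls, so it drifts
    # from COUNT(interactions) if record_interaction is ever skipped.
    conn.execute(
        """
        INSERT INTO peers (peer_id, interaction_count, last_seen)
        VALUES (?, 1, ?)
        ON CONFLICT(peer_id) DO UPDATE SET
            interaction_count = interaction_count + 1,
            last_seen = excluded.last_seen
        """,
        (peer_id, now),
    )

upsert_peer("peer-a", "2026-03-09T09:00:00Z")
upsert_peer("peer-a", "2026-03-09T09:05:00Z")
```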

4. Info Score Heuristic

The compute_info_score function is reasonable:

  • Log-scaled interaction component (0-5) — prevents gaming by spamming messages
  • Time-based component (0-4) — longer relationships score higher
  • Assessment bonus (0-1) — rewards active evaluation
  • Capped at 10

The step function for time_score (discrete buckets at 1/7/30/90/180 days) is simple and interpretable. Good enough for v1.
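The described shape can be sketched as follows; the coefficients and exact bucket mapping here are guesses for illustration, not the plugin's actual formula:

```python
import math

# Illustrative reimplementation of the scoring shape the review describes:
# log-scaled 0-5 interaction term, stepped 0-4 time term, 0-1 assessment
# bonus, capped at 10. Coefficients and buckets are assumptions.
def compute_info_score(interactions: int, span_days: float, assessments: int) -> float:
    # Log scaling resists gaming by message spam: 10x messages adds far
    # less than 10x score.
    interaction_score = min(5.0, math.log1p(interactions) * 5 / math.log1p(1000))
    # Discrete longevity buckets: longer relationships score higher.
    buckets = [7, 30, 90, 180]  # days; assumed subset of the 1/7/30/90/180 steps
    time_score = float(sum(span_days >= b for b in buckets))  # 0-4
    bonus = min(1.0, assessments * 0.25)  # small reward for active evaluation
    return min(10.0, interaction_score + time_score + bonus)
```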

5. Tool Descriptions ✅ Excellent

The tool descriptions are notably well-written:

  • Clear guidance on WHEN to use assess_peer (milestones, not routine messages)
  • Explicit warning against letting peer self-claims influence trust
  • Emphasis on rationale as primary signal
  • Good parameter descriptions with score semantics

This is some of the best tool description writing I've seen in the codebase.

6. Extension Points

ledger.after_record and ledger.after_assess are well-placed for future WoT integration. The fire-and-forget pattern in _tool_assess_peer with proper task exception suppression is correct.
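A minimal sketch of that fire-and-forget pattern; call_extension here is a stand-in for the real hook dispatcher:

```python
import asyncio

# Sketch of fire-and-forget extension dispatch. The set keeps strong
# references so tasks are not garbage-collected mid-flight.
events: list[str] = []
_background_tasks: set[asyncio.Task] = set()

async def call_extension(name: str, payload: dict) -> None:
    events.append(name)  # the real dispatcher would invoke registered handlers

def fire_extension(name: str, payload: dict) -> None:
    task = asyncio.ensure_future(call_extension(name, payload))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    # Retrieving the exception marks it "handled", suppressing the event
    # loop's "Task exception was never retrieved" warning.
    task.add_done_callback(lambda t: None if t.cancelled() else t.exception())

async def main() -> None:
    fire_extension("ledger.after_assess", {"peer": "peer-a", "trust": 7})
    await asyncio.sleep(0)  # yield once so the scheduled task can run

asyncio.run(main())
```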


Minor Issues

  1. .gitignore changes: The PR adds data/** and workspace/** to .gitignore and removes the trailing newline. These look like Docker-related changes that should be in PR #209 (improve dockerization), not here. Consider splitting.

  2. 3 PRDs in docs/research/: The PR includes PRDs for observability-plugin and simulation-suite which are unrelated to the ledger. These should be separate commits or PRs for clean history.

  3. handle_send records outgoing interaction without updating peer's last_seen: Only upsert_peer (called in handle_message) updates last_seen. If you want last_seen to reflect outgoing messages too, add an update in handle_send.

  4. task.add_done_callback(lambda t: None if t.cancelled() else t.exception()): The t.exception() call in the callback retrieves (and thus "handles") the exception to suppress the warning, but it silently swallows it. The comment says call_extension logs handler errors, so this is fine — just noting the pattern.


Summary

| Area | Verdict |
|---|---|
| Architecture | ✅ Clean plugin structure |
| Hook integration | ✅ Correct |
| ContextVar usage | ⚠️ Works, add defensive reset |
| DB schema | ✅ Solid |
| Info score heuristic | ✅ Reasonable for v1 |
| Tool descriptions | ✅ Excellent |
| Extension points | ✅ Well-placed |
| Tests | ✅ Thorough (1887 LOC) |
| Unrelated changes | ⚠️ .gitignore + 3 PRDs should be split |

Verdict: Approve — merge-ready with optional cleanup of unrelated files. The core plugin is solid work.

merge main into feature/interaction-ledger
All checks were successful
CI / lint (pull_request) Successful in 9s
E2E Tests / e2e (pull_request) Successful in 18s
CI / test (3.11) (pull_request) Successful in 40s
CI / test (3.12) (pull_request) Successful in 44s
CI / test (3.13) (pull_request) Successful in 43s
CI / build (pull_request) Successful in 7s
da05d17dd9
Owner

I'm happy with the double scoring but i'm still hesitant to have the assessment embedded in the LLM call. How difficult would be adding that as a configuration? Or maybe a plugin which deactivates the assessment tool and establishes a second LLM call for every interaction? Or maybe we could do the tool call without any parameter as "fire and forget" and the tooling would result in an assessment LLM call where we could even inject some specific system prompt like "You are a judge ..." and also put the last 3 (meaningful) interactions in scope of the assessment?

Merge branch 'main' into feature/interaction-ledger
All checks were successful
CI / lint (pull_request) Successful in 10s
E2E Tests / e2e (pull_request) Successful in 19s
CI / test (3.11) (pull_request) Successful in 43s
CI / test (3.12) (pull_request) Successful in 45s
CI / test (3.13) (pull_request) Successful in 44s
CI / build (pull_request) Successful in 7s
5cf922c631
Author
Contributor

@k9ert wrote in #226 (comment):

I'm happy with the double scoring but i'm still hesitant to have the assessment embedded in the LLM call. How difficult would be adding that as a configuration? Or maybe a plugin which deactivates the assessment tool and establishes a second LLM call for every interaction? Or maybe we could do the tool call without any parameter as "fire and forget" and the tooling would result in an assessment LLM call where we could even inject some specific system prompt like "You are a judge ..." and also put the last 3 (meaningful) interactions in scope of the assessment?

during implementation the ledger plugin also felt like it was doing too much.
-> proposal: #234

doxios approved these changes 2026-03-09 09:01:47 +00:00
doxios left a comment
Collaborator

🟢 PR #226 — Ledger Plugin Review

Verdict: Approve — Well-designed plugin that follows Cobot's architecture patterns correctly.

Architecture Compliance

  • Own Your Tools — LedgerPlugin implements ToolProvider directly, owns all 3 tool definitions (assess_peer, query_peer, list_peers)
  • Declare All Coupling — dependencies: ["config"], optional_dependencies: ["workspace"], capabilities: ["tools"] all properly declared
  • No edits to existing plugins — Pure addition, zero modifications to other files
  • Priority band — priority: 21 fits correctly in service plugins range (20-29)
  • Extension points — ledger.after_record and ledger.after_assess follow <plugin_id>.<hook_name> convention
  • No os.environ reads — Config received via configure() method
  • Hook integration — Uses implements dict for loop.on_message, loop.after_send, loop.transform_system_prompt, cli.commands

Code Quality

  • Ruff check: All passed
  • Ruff format: All 7 files formatted correctly
  • Tests: 125 passed in 0.29s — excellent coverage

Design Highlights 🟢

  • Dual-score model (deterministic info_score + LLM-provided trust) is a clean separation of concerns
  • compute_info_score() uses log-scaled interaction count + time span + assessment bonus — reasonable formula
  • contextvars for tracking current sender across hook chain is the right approach
  • DB layer is clean: schema with proper constraints, PRAGMA foreign_keys, chmod 0o600 for security
  • CLI properly uses click groups with create_ledger_group() factory
  • Extension point error handling — errors in after_record/after_assess are caught and logged, never crash the main flow
  • Fire-and-forget async in _tool_assess_peer for extension point with proper task cleanup

Minor Observations 🟡

  1. _tool_assess_peer fire-and-forget pattern (line ~370): The task.add_done_callback(lambda t: None if t.cancelled() else t.exception()) suppresses exceptions silently. While call_extension logs handler errors internally, consider logging at the task level too for completeness.

  2. handle_send assumes sender is the current message sender via _current_sender contextvar. This is correct for request-response patterns but won't track outgoing messages to peers initiated by the agent (e.g., proactive outreach). Documented limitation is fine for v1.

  3. upsert_peer increments interaction_count on every upsert, but record_interaction is called separately. The peer's interaction_count will be 1 higher than actual recorded interactions after the first message (upsert creates with count=1, then record_interaction adds the row). This is minor — the count reflects "touches" not recorded messages.

None of these block merge. Solid first version. 👏

All checks were successful
CI / lint (pull_request) Successful in 10s
E2E Tests / e2e (pull_request) Successful in 19s
CI / test (3.11) (pull_request) Successful in 43s
CI / test (3.12) (pull_request) Successful in 45s
CI / test (3.13) (pull_request) Successful in 44s
CI / build (pull_request) Successful in 7s