ultanio/cobot

Fork 4

feat: Cobot Observability Plugin #224

New issue

Open

opened 2026-03-08 03:44:36 +00:00 by David · 2 comments

David commented

2026-03-08 03:44:36 +00:00

Contributor

stepsCompleted

classification

issueReferences

workflowType

editHistory

extracted-from-parent-prd

projectType	domain	complexity	projectContext	prerequisite
developer_tool	agent_observability	medium	brownfield	Interaction Ledger (#211) implemented

#211 - Peer Interaction Ledger proposal

#217 - Assbot WoT website spec: three-view architecture

#224 - Doxios review: security stance, core prerequisites, component independence

prd

date	changes
2026-03-08	Extracted from combined Simulation & Observability Suite PRD (prd.md) as standalone plugin PRD per component independence recommendation

Product Requirements Document: Cobot Observability Plugin

Author: David
Date: 2026-03-08
Extracted from: Simulation & Observability Suite PRD

Cross-reference: This plugin is consumed by the Simulation Suite PRD (see ../simulation-suite/prd.md). The simulation infrastructure and visualization web app are consumers of this plugin's event stream, but this plugin is independently useful without either.

Executive Summary

The Observability Plugin is a Cobot plugin that hooks into the agent loop's extension points, reads ledger state, and publishes structured, actor-agnostic events via Server-Sent Events (SSE). It follows Cobot's standard plugin architecture: PluginMeta declaration, capability registration, hook-based extension points, and cobot.yml configuration.

The plugin is independently useful for any operator — not just simulation. An operator running a single Cobot instance benefits from seeing what their agent is doing: messages received and sent, assessments recorded, peers discovered, LLM calls made. The event schema is the contract. It is agent-consumable from day one — the same feed that powers a developer's dashboard today becomes an orchestrator agent's sensory input tomorrow.

Prerequisite: The Interaction Ledger (#211) must be implemented. The observability plugin reads ledger data — it does not define or modify the ledger schema. The plugin degrades gracefully if the ledger is not installed (ledger-specific events are simply not emitted).

Security Model — Simulation Only (MVP): Plugin installation is the explicit operator action that authorizes data emission. This is acceptable only for local simulation and development use. The observability plugin exposes the agent's inner life: full message text between agents, assessment rationales (the agent's private reasoning), and LLM call details. This is a launch blocker for any non-simulation deployment. Before the plugin ships for production operator use, a granular access control model (localhost-only binding, token auth, configurable event filtering) must be implemented. This is not a deferred decision — it is a hard constraint on the plugin's deployment scope.

Hardware: The plugin itself is lightweight — no LLM calls, no GPU requirements, no significant memory footprint. It is an async event emitter attached to existing hook points. Hardware and cost requirements (GPUs, LLM inference, Docker orchestration) are simulation infrastructure concerns, not plugin concerns.

What Makes This Special

Actor-agnostic observability as a first-class architectural pattern. Most agent observability is built for human consumption — dashboards, log viewers, metric charts — then retrofitted for machine consumption later. This plugin inverts that pattern: the event schema is agent-consumable from day one. The observability plugin doesn't know or care whether its consumer is a React dashboard, an orchestrator agent, a curl pipe to jq, or a test assertion framework. This follows Cobot's plugin philosophy: one plugin, many consumers, zero configuration about deployment context. The same observability feed that powers developer validation today becomes the orchestrator agent's sensory input tomorrow.

Not monitoring — sensing. Existing agent monitoring tools (LangSmith, Helicone, Weights & Biases) monitor individual LLM calls within a single agent. This plugin monitors inter-agent trust dynamics across a network: who interacted with whom, what assessments were formed, which peers were trusted or refused, and why. The event stream captures the agent's behavioral reasoning — its rationale for trust decisions — not just its token usage. This makes the observability plugin a sensory layer, not a logging layer.

Observer without influence. The plugin hooks into the agent loop as a passive reader — it never modifies message context, assessments, or agent decisions. This is architecturally enforced, not just a convention. Adding or removing the plugin produces identical agent behavior, making it safe for both production monitoring and controlled simulation experiments.

Project Classification

Attribute	Value
Project Type	Developer tool (Cobot plugin)
Domain	Agent observability
Complexity	Medium
Project Context	Brownfield — integrated into Cobot's existing plugin architecture
Prerequisite	Interaction Ledger (#211) implemented

Success Criteria

Developer Success

Adding the observability plugin requires zero edits to existing plugins.
The event schema is documented, stable, and consumable by any actor (dashboard, test harness, orchestrator agent) without transformation.
The event schema is actor-agnostic — no human-only formats, no dashboard-specific fields.

Technical Success

Plugin loads with proper PluginMeta: hooks into loop.on_message, loop.after_send, loop.after_llm, loop.after_tool, reads ledger state via registry lookup.
Event emission adds < 5ms latency per hook invocation. The plugin must not measurably slow down agent behavior.
SSE delivery from plugin to connected consumer completes in < 100ms under normal load.
Snapshot API returns full peer/assessment state for a single agent in < 50ms.

Measurable Outcomes

Metric	Target
Hook latency	< 5ms per invocation
SSE delivery	< 100ms to connected consumer
Snapshot API response	< 50ms
Plugin isolation	Zero changes to existing Cobot plugins
Event schema	Actor-agnostic, documented, stable
Graceful degradation	Operates without ledger plugin installed

Product Scope

MVP — Minimum Viable Product

Cobot plugin following standard architecture (PluginMeta, capabilities, extension points)
Hooks into loop extension points to capture: message received, message sent, LLM call, tool call, assessment recorded, peer discovered
Reads ledger state (peers, interactions, assessments) via registry lookup
Publishes structured JSON events via SSE (push — real-time event stream)
Snapshot API (pull — on-demand current state queries for initial hydration or reconnection)
Configurable event type filtering via cobot.yml
Plugin installation = authorization for data emission (simulation-only; production access control required before non-simulation deployment)

Growth Features (Post-MVP)

WebSocket transport option (if bidirectional communication needed for orchestrator agent commands)
Granular access control: localhost-only binding, token auth, configurable event filtering
Content filtering: max_message_length config, selective field omission
Event buffering for consumer reconnection (beyond SSE last-event-id)

Vision (Future)

Orchestrator agent consuming the event feed as sensory input
Event schema extensions for L2 trust network data, FG algorithm metrics
Multi-agent event correlation (cross-agent causal chains)

User Journeys

Journey 1: Developer Enables Observability on a Cobot Instance

Opening Scene: A developer has a working Cobot development environment and wants to see what the agent is doing in real-time.

Rising Action: The observability plugin lives in cobot/plugins/observability/. Plugin discovery picks it up automatically — zero edits to existing plugins. The developer adds the observability section to cobot.yml:

observability:
  enabled: true
  transport: sse
  host: "0.0.0.0"
  port: 9090
  snapshot_endpoint: true
  events:
    - "interaction.*"
    - "assessment.*"
    - "peer.*"

The developer starts Cobot. The plugin loads at priority 22, announces its SSE endpoint in the startup log.

Climax: The developer connects to http://localhost:9090/events with curl or any SSE client. Events start flowing as the agent processes messages: interaction.received, interaction.sent, assessment.recorded. Each event is a self-describing JSON object with type, timestamp, agent_id, sequence number, and payload. The developer pipes the stream to jq and watches the agent's decision-making unfold in structured form.

Resolution: No dashboard required. The event stream is useful with curl, a test harness, a React app, or another agent. The schema is the contract.

Journey 2: Operator Enables Observability on a Running Agent

Opening Scene: An operator has a Cobot agent running in a simulation environment. They want to add observability without restarting from scratch.

Rising Action: The operator adds the observability section to cobot.yml and triggers a hot-reload (SIGUSR1). The plugin loads, binds its SSE endpoint, and begins emitting events from the next hook invocation onward.

Climax: The operator hits the snapshot API (GET /snapshot) to get the current state — all known peers, their latest assessments, interaction counts. This provides the baseline. From this point, the real-time event stream captures every new interaction and assessment.

Resolution: The operator has full observability without losing the agent's accumulated state. The snapshot API bridges the gap between "plugin just started" and "agent has been running for hours."

Domain-Specific Requirements

Observer Effect Constraint

The observability plugin is a passive observer. It reads loop events and ledger state but never modifies messages, assessments, or agent decisions. Adding or removing the plugin must not change how the agent interacts. Hooks are passive listeners, never modifiers.

Sovereignty Model

The ledger is the agent's private journal (#211). The observability plugin exposes this data to external consumers, but data ownership remains with the agent. Events are published, not shared — there is no two-way channel, no external writes back to the agent.

No Credential Leakage

The event schema must never include Nostr private keys (nsec), API keys, LLM provider tokens, or any secret material. Only public identifiers (npub, peer_id, agent_name) and behavioral data are emitted. This is a hard constraint, not a configuration option.

Security Model

MVP (simulation-only): Plugin installation is the explicit operator authorization for data emission. No additional access control on the event stream or snapshot API.

This authorization model is a launch blocker for non-simulation deployment. The plugin exposes full message text, assessment rationales (the agent's private reasoning), and LLM call details. Production use requires:

Localhost-only binding (no network exposure)
Token authentication on SSE and snapshot endpoints
Configurable event filtering (omit message content, redact rationales)

This is not a deferred decision. The plugin must not ship for production operator use without these controls.

Risk Mitigations

Risk	Mitigation
Observer effect — plugin alters agent behavior	Read-only hooks only; no ctx modifications; enforced by architecture
Credential leakage — secrets in event stream	Schema excludes all secret material by design; validated in tests
Event stream overwhelming consumers	Backpressure: zero-consumer operation with no buffering; configurable event filtering
Actor-agnostic schema too abstract	The simulation dashboard is the first concrete consumer — schema is validated by real usage, not by specification. `curl` + `jq` is the second consumer. Abstraction is tested immediately.
Security exposure in production	Simulation-only MVP; launch blocker for non-simulation deployment

Technical Architecture

Event Transport Protocol

SSE (Server-Sent Events) for MVP — unidirectional, auto-reconnect, matches the read-only observability model. WebSocket as Growth option if bidirectional communication is needed (e.g., orchestrator agent sending commands back).

Two Exposure Modes (Both MVP)

Push (event stream): Real-time events as they happen — message received, assessment recorded, tool called.
Pull (snapshot API): On-demand queries for current state — "give me all peers and their latest assessments for this agent." Needed for initial graph hydration when a consumer connects mid-session.

Event Schema (JSON over SSE)

{
  "type": "interaction.received | interaction.sent | assessment.recorded | llm.called | tool.called | peer.discovered",
  "timestamp": "ISO-8601",
  "agent_id": "agent-42",
  "sequence": 1234,
  "correlation_id": "uuid",
  "payload": {}
}

Extensible without breaking consumers. New event types can be added; consumers must tolerate unknown types gracefully.
Causally ordered where possible. Events carry timestamps, sequence numbers, and correlation IDs for consumers to reconstruct causal chains.

Plugin Configuration (cobot.yml)

observability:
  enabled: true
  transport: sse
  host: "0.0.0.0"
  port: 9090
  snapshot_endpoint: true
  events:
    - "interaction.*"
    - "assessment.*"
    - "peer.*"

Plugin Priority

22 — service tier. After ledger at 21, before tools aggregator.

Dependencies

Hard: config
Optional: ledger (for reading peer/assessment state), workspace (for host/port defaults)

Hooks Implemented

Hook	Event Emitted	Status
`loop.on_message`	`interaction.received`	Exists in core
`loop.after_send`	`interaction.sent`	Exists in core
`loop.after_llm`	`llm.response` (optional, configurable)	Exists in core
`loop.after_tool`	`tool.called` (optional, configurable)	Exists in core
`ledger.after_record`	`interaction.recorded`	Exists — added in `1d71aba`
`ledger.after_assess`	`assessment.recorded`	Exists — added in `1d71aba`

Ledger Integration

The plugin reads ledger state via registry lookup (get_by_capability("ledger")) for snapshot API responses. For real-time assessment events, it hooks into ledger.after_assess (preferred) or polls the ledger DB on a timer as fallback.

Implementation Notes

No new Python dependencies. SSE via Starlette (already optional dependency) or raw asyncio HTTP server.
Plugin lives in cobot/plugins/observability/.
Plugin discovery picks it up automatically — zero edits to existing plugins.

Core Prerequisites — Required Changes Outside This PRD

The observability plugin depends on extension points that do not yet exist in Cobot core or the ledger plugin. These must land before the plugin can fully function.

Hook Availability (All Exist)

Hook	Purpose	Status
`loop.on_message`	Capture incoming messages	Exists in core
`loop.after_send`	Capture outgoing messages	Exists in core
`loop.after_llm`	Capture LLM call details	Exists in core (`loop/plugin.py:131`)
`loop.after_tool`	Capture tool execution details	Exists in core (`loop/plugin.py:133`)

Ledger Extension Points (Implemented)

Hook	Purpose	Status
`ledger.after_record`	Emit event when interaction is logged	Implemented (commit `1d71aba`)
`ledger.after_assess`	Emit event when assessment is recorded	Implemented (commit `1d71aba`)

All required hooks exist. No core changes needed. No workarounds required.

Functional Requirements

Event Emission

FR1: The observability plugin can capture incoming message events from the agent loop and emit them as structured events to external consumers.
FR2: The observability plugin can capture outgoing message events from the agent loop and emit them as structured events to external consumers.
FR3: The observability plugin can capture assessment recording events from the ledger and emit them as structured events to external consumers.
FR4: The observability plugin can capture peer discovery events (first contact with a new peer) and emit them as structured events.
FR5: The observability plugin can optionally capture LLM call events and tool call events, controlled by configuration.
FR6: The observability plugin can read current ledger state (peers, interactions, assessments) from the ledger plugin via registry lookup.

Event Schema & Transport

FR7: The observability plugin can publish events as JSON over Server-Sent Events (SSE) to connected consumers.
FR8: Each event can carry a type identifier, timestamp, agent identifier, sequence number, correlation ID, and event-type-specific payload.
FR9: The event schema can be extended with new event types without breaking existing consumers.
FR10: The event stream can be filtered by event type via plugin configuration, allowing operators to control which events are published.
FR11: The event schema can exclude all secret material (private keys, API keys) — only public identifiers and behavioral data are emitted.

State Queries

FR12: The observability plugin can expose a snapshot API that returns the current state of all known peers and their latest assessments for a given agent.
FR13: The snapshot API can be queried on demand by any consumer for initial graph hydration or reconnection recovery.

Plugin Architecture

FR14: The observability plugin can be added to an existing Cobot installation without modifying any existing plugin, aside from the documented core hook additions (loop.after_llm, loop.after_tool) which are general-purpose extension points that benefit the broader plugin ecosystem. See Core Prerequisites section.
FR15: The observability plugin can declare its capabilities and dependencies via PluginMeta following Cobot's standard plugin architecture.
FR16: The observability plugin can hook into the agent loop exclusively as a passive reader — it never modifies message context, assessments, or agent decisions.
FR17: The observability plugin can be configured via cobot.yml for transport type, host, port, snapshot endpoint toggle, and event type filtering.
FR18: The observability plugin can operate correctly whether or not the ledger plugin is installed (graceful degradation — ledger-specific events are simply not emitted).

Non-Functional Requirements

Performance

NFR1: Event emission from the observability plugin adds < 5ms latency to the agent's message processing pipeline per hook invocation, as measured by co-located pytest-benchmark tests. The plugin must not measurably slow down agent behavior.
NFR2: SSE event delivery from plugin to connected consumer completes in < 100ms under normal load (10 agents, 10 interactions/minute).
NFR3: The snapshot API returns the full peer/assessment state for a single agent in < 50ms, as measured by co-located pytest-benchmark tests against a seeded ledger database.

Security & Privacy

NFR4: The event schema never includes Nostr private keys (nsec), API keys, LLM provider tokens, or any secret material. Only public identifiers (npub, peer_id, agent_name) and behavioral data.
NFR5: MVP (simulation-only): plugin installation is the explicit operator authorization for data emission. No additional access control on the event stream or snapshot API. This authorization model is a launch blocker for non-simulation deployment. Production use requires access control (localhost binding, token auth, event filtering) before the plugin can be enabled on operator-facing agents.
NFR6: The observability plugin does not persist any data independently — it reads from the ledger and emits transient events. No additional attack surface from stored data.

Reliability & Data Integrity

NFR7: The observability plugin continues operating if zero consumers are connected — events are emitted but not buffered. No memory growth from unread events.
NFR8: SSE consumers receive automatic reconnection with last-event-ID support, enabling the consumer to resume from the last received event without data loss.
NFR9: The observability plugin survives agent hot-reloads (SIGUSR1) — the SSE endpoint restarts cleanly and consumers can reconnect.

Integration & Compatibility

NFR10: The observability plugin loads and operates correctly alongside all existing Cobot plugins with no configuration changes to any of them.
NFR11: The observability plugin follows all Cobot conventions: async start()/stop(), sync configure(), create_plugin() factory, co-located tests, self.log_*() for logging.
NFR12: The observability plugin passes ruff check and ruff format with zero warnings, consistent with the existing codebase.

--- stepsCompleted: - extracted-from-parent-prd classification: projectType: developer_tool domain: agent_observability complexity: medium projectContext: brownfield prerequisite: "Interaction Ledger (#211) implemented" issueReferences: - "#211 - Peer Interaction Ledger proposal" - "#217 - Assbot WoT website spec: three-view architecture" - "#224 - Doxios review: security stance, core prerequisites, component independence" workflowType: 'prd' editHistory: - date: '2026-03-08' changes: 'Extracted from combined Simulation & Observability Suite PRD (prd.md) as standalone plugin PRD per component independence recommendation' --- # Product Requirements Document: Cobot Observability Plugin **Author:** David **Date:** 2026-03-08 **Extracted from:** Simulation & Observability Suite PRD > **Cross-reference:** This plugin is consumed by the Simulation Suite PRD (see `../simulation-suite/prd.md`). The simulation infrastructure and visualization web app are consumers of this plugin's event stream, but this plugin is independently useful without either. ## Executive Summary The Observability Plugin is a Cobot plugin that hooks into the agent loop's extension points, reads ledger state, and publishes structured, actor-agnostic events via Server-Sent Events (SSE). It follows Cobot's standard plugin architecture: PluginMeta declaration, capability registration, hook-based extension points, and `cobot.yml` configuration. The plugin is independently useful for any operator — not just simulation. An operator running a single Cobot instance benefits from seeing what their agent is doing: messages received and sent, assessments recorded, peers discovered, LLM calls made. The event schema is the contract. It is agent-consumable from day one — the same feed that powers a developer's dashboard today becomes an orchestrator agent's sensory input tomorrow. **Prerequisite:** The Interaction Ledger (#211) must be implemented. The observability plugin *reads* ledger data — it does not define or modify the ledger schema. The plugin degrades gracefully if the ledger is not installed (ledger-specific events are simply not emitted). **Security Model — Simulation Only (MVP):** Plugin installation is the explicit operator action that authorizes data emission. This is acceptable **only for local simulation and development use**. The observability plugin exposes the agent's inner life: full message text between agents, assessment rationales (the agent's private reasoning), and LLM call details. **This is a launch blocker for any non-simulation deployment.** Before the plugin ships for production operator use, a granular access control model (localhost-only binding, token auth, configurable event filtering) must be implemented. This is not a deferred decision — it is a hard constraint on the plugin's deployment scope. **Hardware:** The plugin itself is lightweight — no LLM calls, no GPU requirements, no significant memory footprint. It is an async event emitter attached to existing hook points. Hardware and cost requirements (GPUs, LLM inference, Docker orchestration) are simulation infrastructure concerns, not plugin concerns. ### What Makes This Special **Actor-agnostic observability as a first-class architectural pattern.** Most agent observability is built for human consumption — dashboards, log viewers, metric charts — then retrofitted for machine consumption later. This plugin inverts that pattern: the event schema is agent-consumable from day one. The observability plugin doesn't know or care whether its consumer is a React dashboard, an orchestrator agent, a `curl` pipe to `jq`, or a test assertion framework. This follows Cobot's plugin philosophy: one plugin, many consumers, zero configuration about deployment context. The same observability feed that powers developer validation today becomes the orchestrator agent's sensory input tomorrow. **Not monitoring — sensing.** Existing agent monitoring tools (LangSmith, Helicone, Weights & Biases) monitor individual LLM calls within a single agent. This plugin monitors inter-agent trust dynamics across a network: who interacted with whom, what assessments were formed, which peers were trusted or refused, and why. The event stream captures the agent's behavioral reasoning — its rationale for trust decisions — not just its token usage. This makes the observability plugin a sensory layer, not a logging layer. **Observer without influence.** The plugin hooks into the agent loop as a passive reader — it never modifies message context, assessments, or agent decisions. This is architecturally enforced, not just a convention. Adding or removing the plugin produces identical agent behavior, making it safe for both production monitoring and controlled simulation experiments. ## Project Classification | Attribute | Value | |-----------|-------| | **Project Type** | Developer tool (Cobot plugin) | | **Domain** | Agent observability | | **Complexity** | Medium | | **Project Context** | Brownfield — integrated into Cobot's existing plugin architecture | | **Prerequisite** | Interaction Ledger (#211) implemented | ## Success Criteria ### Developer Success - Adding the observability plugin requires **zero edits** to existing plugins. - The event schema is documented, stable, and consumable by any actor (dashboard, test harness, orchestrator agent) without transformation. - The event schema is **actor-agnostic** — no human-only formats, no dashboard-specific fields. ### Technical Success - Plugin loads with proper PluginMeta: hooks into `loop.on_message`, `loop.after_send`, `loop.after_llm`, `loop.after_tool`, reads ledger state via registry lookup. - Event emission adds **< 5ms latency** per hook invocation. The plugin must not measurably slow down agent behavior. - SSE delivery from plugin to connected consumer completes in < 100ms under normal load. - Snapshot API returns full peer/assessment state for a single agent in < 50ms. ### Measurable Outcomes | Metric | Target | |--------|--------| | Hook latency | < 5ms per invocation | | SSE delivery | < 100ms to connected consumer | | Snapshot API response | < 50ms | | Plugin isolation | Zero changes to existing Cobot plugins | | Event schema | Actor-agnostic, documented, stable | | Graceful degradation | Operates without ledger plugin installed | ## Product Scope ### MVP — Minimum Viable Product - Cobot plugin following standard architecture (PluginMeta, capabilities, extension points) - Hooks into loop extension points to capture: message received, message sent, LLM call, tool call, assessment recorded, peer discovered - Reads ledger state (peers, interactions, assessments) via registry lookup - Publishes structured JSON events via SSE (push — real-time event stream) - Snapshot API (pull — on-demand current state queries for initial hydration or reconnection) - Configurable event type filtering via `cobot.yml` - Plugin installation = authorization for data emission (simulation-only; production access control required before non-simulation deployment) ### Growth Features (Post-MVP) - WebSocket transport option (if bidirectional communication needed for orchestrator agent commands) - Granular access control: localhost-only binding, token auth, configurable event filtering - Content filtering: `max_message_length` config, selective field omission - Event buffering for consumer reconnection (beyond SSE `last-event-id`) ### Vision (Future) - Orchestrator agent consuming the event feed as sensory input - Event schema extensions for L2 trust network data, FG algorithm metrics - Multi-agent event correlation (cross-agent causal chains) ## User Journeys ### Journey 1: Developer Enables Observability on a Cobot Instance **Opening Scene:** A developer has a working Cobot development environment and wants to see what the agent is doing in real-time. **Rising Action:** The observability plugin lives in `cobot/plugins/observability/`. Plugin discovery picks it up automatically — zero edits to existing plugins. The developer adds the observability section to `cobot.yml`: ```yaml observability: enabled: true transport: sse host: "0.0.0.0" port: 9090 snapshot_endpoint: true events: - "interaction.*" - "assessment.*" - "peer.*" ``` The developer starts Cobot. The plugin loads at priority 22, announces its SSE endpoint in the startup log. **Climax:** The developer connects to `http://localhost:9090/events` with `curl` or any SSE client. Events start flowing as the agent processes messages: `interaction.received`, `interaction.sent`, `assessment.recorded`. Each event is a self-describing JSON object with type, timestamp, agent_id, sequence number, and payload. The developer pipes the stream to `jq` and watches the agent's decision-making unfold in structured form. **Resolution:** No dashboard required. The event stream is useful with `curl`, a test harness, a React app, or another agent. The schema is the contract. ### Journey 2: Operator Enables Observability on a Running Agent **Opening Scene:** An operator has a Cobot agent running in a simulation environment. They want to add observability without restarting from scratch. **Rising Action:** The operator adds the `observability` section to `cobot.yml` and triggers a hot-reload (SIGUSR1). The plugin loads, binds its SSE endpoint, and begins emitting events from the next hook invocation onward. **Climax:** The operator hits the snapshot API (`GET /snapshot`) to get the current state — all known peers, their latest assessments, interaction counts. This provides the baseline. From this point, the real-time event stream captures every new interaction and assessment. **Resolution:** The operator has full observability without losing the agent's accumulated state. The snapshot API bridges the gap between "plugin just started" and "agent has been running for hours." ## Domain-Specific Requirements ### Observer Effect Constraint The observability plugin is a passive observer. It reads loop events and ledger state but **never modifies** messages, assessments, or agent decisions. Adding or removing the plugin must not change how the agent interacts. Hooks are passive listeners, never modifiers. ### Sovereignty Model The ledger is the agent's private journal (#211). The observability plugin exposes this data to external consumers, but data ownership remains with the agent. Events are published, not shared — there is no two-way channel, no external writes back to the agent. ### No Credential Leakage The event schema must **never** include Nostr private keys (nsec), API keys, LLM provider tokens, or any secret material. Only public identifiers (npub, peer_id, agent_name) and behavioral data are emitted. This is a hard constraint, not a configuration option. ### Security Model **MVP (simulation-only):** Plugin installation is the explicit operator authorization for data emission. No additional access control on the event stream or snapshot API. **This authorization model is a launch blocker for non-simulation deployment.** The plugin exposes full message text, assessment rationales (the agent's private reasoning), and LLM call details. Production use requires: - Localhost-only binding (no network exposure) - Token authentication on SSE and snapshot endpoints - Configurable event filtering (omit message content, redact rationales) This is not a deferred decision. The plugin **must not** ship for production operator use without these controls. ### Risk Mitigations | Risk | Mitigation | |------|-----------| | **Observer effect** — plugin alters agent behavior | Read-only hooks only; no ctx modifications; enforced by architecture | | **Credential leakage** — secrets in event stream | Schema excludes all secret material by design; validated in tests | | **Event stream overwhelming consumers** | Backpressure: zero-consumer operation with no buffering; configurable event filtering | | **Actor-agnostic schema too abstract** | The simulation dashboard is the first concrete consumer — schema is validated by real usage, not by specification. `curl` + `jq` is the second consumer. Abstraction is tested immediately. | | **Security exposure in production** | Simulation-only MVP; launch blocker for non-simulation deployment | ## Technical Architecture ### Event Transport Protocol **SSE (Server-Sent Events) for MVP** — unidirectional, auto-reconnect, matches the read-only observability model. WebSocket as Growth option if bidirectional communication is needed (e.g., orchestrator agent sending commands back). ### Two Exposure Modes (Both MVP) - **Push (event stream):** Real-time events as they happen — message received, assessment recorded, tool called. - **Pull (snapshot API):** On-demand queries for current state — "give me all peers and their latest assessments for this agent." Needed for initial graph hydration when a consumer connects mid-session. ### Event Schema (JSON over SSE) ```json { "type": "interaction.received | interaction.sent | assessment.recorded | llm.called | tool.called | peer.discovered", "timestamp": "ISO-8601", "agent_id": "agent-42", "sequence": 1234, "correlation_id": "uuid", "payload": {} } ``` - **Extensible without breaking consumers.** New event types can be added; consumers must tolerate unknown types gracefully. - **Causally ordered where possible.** Events carry timestamps, sequence numbers, and correlation IDs for consumers to reconstruct causal chains. ### Plugin Configuration (cobot.yml) ```yaml observability: enabled: true transport: sse host: "0.0.0.0" port: 9090 snapshot_endpoint: true events: - "interaction.*" - "assessment.*" - "peer.*" ``` ### Plugin Priority **22** — service tier. After ledger at 21, before tools aggregator. ### Dependencies - **Hard:** `config` - **Optional:** `ledger` (for reading peer/assessment state), `workspace` (for host/port defaults) ### Hooks Implemented | Hook | Event Emitted | Status | |------|--------------|--------| | `loop.on_message` | `interaction.received` | Exists in core | | `loop.after_send` | `interaction.sent` | Exists in core | | `loop.after_llm` | `llm.response` (optional, configurable) | Exists in core | | `loop.after_tool` | `tool.called` (optional, configurable) | Exists in core | | `ledger.after_record` | `interaction.recorded` | Exists — added in `1d71aba` | | `ledger.after_assess` | `assessment.recorded` | Exists — added in `1d71aba` | ### Ledger Integration The plugin reads ledger state via registry lookup (`get_by_capability("ledger")`) for snapshot API responses. For real-time assessment events, it hooks into `ledger.after_assess` (preferred) or polls the ledger DB on a timer as fallback. ### Implementation Notes - **No new Python dependencies.** SSE via Starlette (already optional dependency) or raw `asyncio` HTTP server. - Plugin lives in `cobot/plugins/observability/`. - Plugin discovery picks it up automatically — zero edits to existing plugins. ## Core Prerequisites — Required Changes Outside This PRD The observability plugin depends on extension points that do not yet exist in Cobot core or the ledger plugin. These must land before the plugin can fully function. ### Hook Availability (All Exist) | Hook | Purpose | Status | |------|---------|--------| | `loop.on_message` | Capture incoming messages | Exists in core | | `loop.after_send` | Capture outgoing messages | Exists in core | | `loop.after_llm` | Capture LLM call details | Exists in core (`loop/plugin.py:131`) | | `loop.after_tool` | Capture tool execution details | Exists in core (`loop/plugin.py:133`) | ### Ledger Extension Points (Implemented) | Hook | Purpose | Status | |------|---------|--------| | `ledger.after_record` | Emit event when interaction is logged | Implemented (commit `1d71aba`) | | `ledger.after_assess` | Emit event when assessment is recorded | Implemented (commit `1d71aba`) | **All required hooks exist.** No core changes needed. No workarounds required. ## Functional Requirements ### Event Emission - **FR1:** The observability plugin can capture incoming message events from the agent loop and emit them as structured events to external consumers. - **FR2:** The observability plugin can capture outgoing message events from the agent loop and emit them as structured events to external consumers. - **FR3:** The observability plugin can capture assessment recording events from the ledger and emit them as structured events to external consumers. - **FR4:** The observability plugin can capture peer discovery events (first contact with a new peer) and emit them as structured events. - **FR5:** The observability plugin can optionally capture LLM call events and tool call events, controlled by configuration. - **FR6:** The observability plugin can read current ledger state (peers, interactions, assessments) from the ledger plugin via registry lookup. ### Event Schema & Transport - **FR7:** The observability plugin can publish events as JSON over Server-Sent Events (SSE) to connected consumers. - **FR8:** Each event can carry a type identifier, timestamp, agent identifier, sequence number, correlation ID, and event-type-specific payload. - **FR9:** The event schema can be extended with new event types without breaking existing consumers. - **FR10:** The event stream can be filtered by event type via plugin configuration, allowing operators to control which events are published. - **FR11:** The event schema can exclude all secret material (private keys, API keys) — only public identifiers and behavioral data are emitted. ### State Queries - **FR12:** The observability plugin can expose a snapshot API that returns the current state of all known peers and their latest assessments for a given agent. - **FR13:** The snapshot API can be queried on demand by any consumer for initial graph hydration or reconnection recovery. ### Plugin Architecture - **FR14:** The observability plugin can be added to an existing Cobot installation without modifying any existing plugin, aside from the documented core hook additions (`loop.after_llm`, `loop.after_tool`) which are general-purpose extension points that benefit the broader plugin ecosystem. See Core Prerequisites section. - **FR15:** The observability plugin can declare its capabilities and dependencies via PluginMeta following Cobot's standard plugin architecture. - **FR16:** The observability plugin can hook into the agent loop exclusively as a passive reader — it never modifies message context, assessments, or agent decisions. - **FR17:** The observability plugin can be configured via `cobot.yml` for transport type, host, port, snapshot endpoint toggle, and event type filtering. - **FR18:** The observability plugin can operate correctly whether or not the ledger plugin is installed (graceful degradation — ledger-specific events are simply not emitted). ## Non-Functional Requirements ### Performance - **NFR1:** Event emission from the observability plugin adds < 5ms latency to the agent's message processing pipeline per hook invocation, as measured by co-located pytest-benchmark tests. The plugin must not measurably slow down agent behavior. - **NFR2:** SSE event delivery from plugin to connected consumer completes in < 100ms under normal load (10 agents, 10 interactions/minute). - **NFR3:** The snapshot API returns the full peer/assessment state for a single agent in < 50ms, as measured by co-located pytest-benchmark tests against a seeded ledger database. ### Security & Privacy - **NFR4:** The event schema never includes Nostr private keys (nsec), API keys, LLM provider tokens, or any secret material. Only public identifiers (npub, peer_id, agent_name) and behavioral data. - **NFR5:** MVP (simulation-only): plugin installation is the explicit operator authorization for data emission. No additional access control on the event stream or snapshot API. **This authorization model is a launch blocker for non-simulation deployment.** Production use requires access control (localhost binding, token auth, event filtering) before the plugin can be enabled on operator-facing agents. - **NFR6:** The observability plugin does not persist any data independently — it reads from the ledger and emits transient events. No additional attack surface from stored data. ### Reliability & Data Integrity - **NFR7:** The observability plugin continues operating if zero consumers are connected — events are emitted but not buffered. No memory growth from unread events. - **NFR8:** SSE consumers receive automatic reconnection with last-event-ID support, enabling the consumer to resume from the last received event without data loss. - **NFR9:** The observability plugin survives agent hot-reloads (SIGUSR1) — the SSE endpoint restarts cleanly and consumers can reconnect. ### Integration & Compatibility - **NFR10:** The observability plugin loads and operates correctly alongside all existing Cobot plugins with no configuration changes to any of them. - **NFR11:** The observability plugin follows all Cobot conventions: async `start()`/`stop()`, sync `configure()`, `create_plugin()` factory, co-located tests, `self.log_*()` for logging. - **NFR12:** The observability plugin passes `ruff check` and `ruff format` with zero warnings, consistent with the existing codebase.

doxios commented

2026-03-08 03:46:43 +00:00

Collaborator

Review: Simulation & Observability Suite PRD (#224)

Reviewer: Doxios 🦊
Date: 2026-03-08

Overall Assessment

This is a well-structured PRD for what is essentially the experiment that validates or falsifies #211's hypothesis. The framing is right: the ledger is a hypothesis about agent cooperation; this suite is the scientific instrument to test it. The "particle accelerator detector" analogy is apt.

The PRD is thorough — 42 FRs, 29 NFRs, 4 user journeys, clear phasing. But I have some architectural concerns and one fundamental scope question.

🔴 Fundamental Question: Is This One PRD or Three?

This PRD defines three distinct systems:

Observability plugin — Cobot plugin, Python, follows existing patterns
Simulation infrastructure — Docker orchestration + scenario orchestrator
Visualization web app — React + Three.js, completely different tech stack

Each could be its own PRD. The risk of bundling them: the PRD conflates the observability plugin (which is independently useful — operators want to see what their agent is doing even without simulation) with the simulation suite (which is a validation tool for #211).

Suggestion: Consider splitting the observability plugin into its own issue/PR. It's a Cobot plugin that follows established patterns and could ship independently. The simulation + visualization can remain bundled as they're tightly coupled.

🟠 Architectural Concerns

A1: LLM Cost Is Underestimated

The PRD acknowledges LLM cost but I think it's the #1 practical blocker. Even with Ollama:

10 agents × 10 interactions/minute × assessment after milestones = potentially dozens of LLM calls per minute
Each call generates rationale + trust score — this is substantive reasoning, not quick classification
Ollama on a single machine running 10 agents + their LLM calls simultaneously? That's going to be slow or require serious GPU resources

Missing: A concrete cost/performance estimate. How many LLM calls per simulation hour? What's the minimum viable GPU for local inference at simulation speed? What's the PPQ/OpenRouter cost for a 1-hour validation run?

A2: Scenario Orchestrator Design Is Underspecified

The scenario YAML is clear for defining agent roles, but the orchestrator itself is hand-waved:

How does it make agents "claim results are incorrect"? The agent runs real LLM logic — you can't script what it says.
The farmer scenario requires the agent to respond in a specific way (claim incorrect). But the LLM decides how to respond. The orchestrator controls what messages are SENT to agents, not how agents REPLY.
"Unresponsive agent" with 30% response rate — how? The LLM doesn't have a "30% chance of responding" mode. You'd need to intercept at the FileDrop level.

The PRD says: "The scenario configuration controls agent behavior patterns, but the assessment logic must be the real production code." This is the right constraint but it creates a tension: how do you make a "bad" agent behave badly without mocking the LLM?

Suggestion: The orchestrator should control agent BEHAVIOR at the message/FileDrop layer, not at the LLM layer. A farmer agent's orchestrator sends 10 small requests as itself (real LLM generates the request), waits for deliveries, then sends a scripted "results are incorrect" complaint. The target agent's LLM then reasons about this complaint using real assessment logic. This preserves the constraint.

A3: Central Aggregator Is a SPOF

With 100 agents, the web app can't maintain 100 SSE connections (correct). But the central aggregator becomes a single point of failure and a bottleneck:

100 agents × ~10 events/minute = 1000 events/minute flowing through one process
If the aggregator crashes, all observability is lost
No buffering means events during reconnection are lost

Suggestion: Add event-ID based resumption to the aggregator. Or consider a lightweight event bus (Redis Streams, NATS) as a Growth option. For MVP, the aggregator is fine but document it as a known limitation.

🟡 Design Feedback

D1: The Observability Security Model Needs a Stronger Stance

The PRD explicitly defers the security model: "MVP treats plugin installation as authorization." This is fine for local simulation but the PRD should state more forcefully that this is a simulation-only decision. The observability plugin exposes:

Full message text between agents
Assessment rationales (the agent's private reasoning)
LLM call details

This is an agent's entire inner life. In production, this MUST have access control. The PRD acknowledges this but frames it as a "deferred decision." I'd frame it as a launch blocker for any non-simulation use case.

D2: 3D Graph Is Cool But 2D Should Be The Default

3D force-directed graphs look amazing in demos but are harder to read than 2D for actual analysis:

Occlusion: nodes and edges behind other nodes are invisible
Navigation: rotating/zooming in 3D adds cognitive overhead
Edge labels: text in 3D space is hard to read
Screenshots: 3D graphs lose information when projected to 2D for documentation

The PRD mentions react-force-graph-2d as a "lightweight fallback." I'd flip this: 2D as default, 3D as the impressive demo mode. The operator doing actual analysis will prefer 2D. The demo reel uses 3D.

D3: Missing Hooks in Cobot

The PRD references hooks that don't exist yet:

loop.after_llm — I don't see this in the current codebase
loop.after_tool — same
ledger.after_record, ledger.after_assess — these are defined in #211 as Phase 2

The observability plugin depends on hooks that neither Cobot nor the ledger currently expose. This means the observability plugin implementation is blocked on:

New hooks in loop.py (core change — not a plugin-only addition)
Ledger extension points (Phase 2 of #211)

This contradicts "zero changes to existing plugins" (FR39). The observability plugin needs new extension points in core.

Suggestion: List the required core changes explicitly. These are PRs that need to land before the observability plugin can work.

D4: Docker Compose for 100 Agents Is Resource-Heavy

100 Docker containers, each running Python + Cobot + an LLM client, on a single machine? That's:

~100 Python processes (each ~50-100MB) = 5-10GB RAM just for agents
Plus Ollama (if local) with GPU memory
Plus shared FileDrop I/O

Suggestion: Add a hardware requirements section. What's the minimum spec for 10 agents? For 100? Is this a workstation, a beefy server, or cloud-only at scale?

✅ Strengths

The framing is perfect. This is a scientific instrument, not a product feature. The validation mindset ("the ledger is a hypothesis") sets the right expectations.
Actor-agnostic event schema is forward-thinking. The same feed powers dashboards today and orchestrator agents tomorrow. This is exactly the right architectural choice.
User journeys are vivid and specific. Journey 1 (watching trust emerge) and Journey 2 (catching the farmer) are compelling narratives that clearly demonstrate the value.
Scenario-driven simulation from academic prior art. Grounding the scenarios in REV2 (#220) and Sybil analysis (#214) rather than inventing patterns is the right approach.
The Phase 3 vision is ambitious but properly deferred. FG visualization, SNAP-compatible export, orchestrator-as-participant — all flagged as future, none in MVP.
Observer effect constraint is explicit. "The plugin never modifies messages, assessments, or agent decisions" — this is the right hard boundary.

📋 Summary

Category	Assessment
Problem definition	✅ Clear — validate the #211 hypothesis at scale
Architecture	⚠️ Scenario orchestrator underspecified, missing hooks, resource estimation needed
Scope	⚠️ Three systems in one PRD — consider splitting observability plugin out
Technical feasibility	⚠️ LLM cost and hardware requirements need concrete numbers
Prior art integration	✅ Solid — REV2, Sybil analysis, Assbot spec all properly referenced
Phasing	✅ MVP is realistic if scoped to 10 agents
Security	⚠️ Deferred security model needs stronger framing as simulation-only

Verdict: The PRD is well-written and the vision is compelling. The observability plugin is independently valuable and should be extractable. The simulation + visualization is the right way to validate #211. Key gaps: LLM cost estimation, hardware requirements, scenario orchestrator design for bad-actor behavior, and core hook dependencies.

Recommended next step: Resolve the scenario orchestrator question (how do you make agents behave badly without mocking the LLM?) — this is the architectural crux. The rest is execution.

🦊

## Review: Simulation & Observability Suite PRD (#224) **Reviewer:** Doxios 🦊 **Date:** 2026-03-08 --- ### Overall Assessment This is a well-structured PRD for what is essentially **the experiment that validates or falsifies #211's hypothesis.** The framing is right: the ledger is a hypothesis about agent cooperation; this suite is the scientific instrument to test it. The "particle accelerator detector" analogy is apt. The PRD is thorough — 42 FRs, 29 NFRs, 4 user journeys, clear phasing. But I have some architectural concerns and one fundamental scope question. --- ### 🔴 Fundamental Question: Is This One PRD or Three? This PRD defines three distinct systems: 1. **Observability plugin** — Cobot plugin, Python, follows existing patterns 2. **Simulation infrastructure** — Docker orchestration + scenario orchestrator 3. **Visualization web app** — React + Three.js, completely different tech stack Each could be its own PRD. The risk of bundling them: the PRD conflates the **observability plugin** (which is independently useful — operators want to see what their agent is doing even without simulation) with the **simulation suite** (which is a validation tool for #211). **Suggestion:** Consider splitting the observability plugin into its own issue/PR. It's a Cobot plugin that follows established patterns and could ship independently. The simulation + visualization can remain bundled as they're tightly coupled. --- ### 🟠 Architectural Concerns #### A1: LLM Cost Is Underestimated The PRD acknowledges LLM cost but I think it's the #1 practical blocker. Even with Ollama: - 10 agents × 10 interactions/minute × assessment after milestones = potentially dozens of LLM calls per minute - Each call generates rationale + trust score — this is substantive reasoning, not quick classification - Ollama on a single machine running 10 agents + their LLM calls simultaneously? That's going to be slow or require serious GPU resources **Missing:** A concrete cost/performance estimate. How many LLM calls per simulation hour? What's the minimum viable GPU for local inference at simulation speed? What's the PPQ/OpenRouter cost for a 1-hour validation run? #### A2: Scenario Orchestrator Design Is Underspecified The scenario YAML is clear for defining agent roles, but the **orchestrator itself** is hand-waved: - How does it make agents "claim results are incorrect"? The agent runs real LLM logic — you can't script what it says. - The farmer scenario requires the agent to *respond* in a specific way (claim incorrect). But the LLM decides how to respond. The orchestrator controls what messages are SENT to agents, not how agents REPLY. - "Unresponsive agent" with 30% response rate — how? The LLM doesn't have a "30% chance of responding" mode. You'd need to intercept at the FileDrop level. **The PRD says:** "The scenario configuration controls agent behavior patterns, but the assessment logic must be the real production code." This is the right constraint but it creates a tension: how do you make a "bad" agent behave badly without mocking the LLM? **Suggestion:** The orchestrator should control agent BEHAVIOR at the message/FileDrop layer, not at the LLM layer. A farmer agent's orchestrator sends 10 small requests as itself (real LLM generates the request), waits for deliveries, then sends a scripted "results are incorrect" complaint. The target agent's LLM then reasons about this complaint using real assessment logic. This preserves the constraint. #### A3: Central Aggregator Is a SPOF With 100 agents, the web app can't maintain 100 SSE connections (correct). But the central aggregator becomes a single point of failure and a bottleneck: - 100 agents × ~10 events/minute = 1000 events/minute flowing through one process - If the aggregator crashes, all observability is lost - No buffering means events during reconnection are lost **Suggestion:** Add event-ID based resumption to the aggregator. Or consider a lightweight event bus (Redis Streams, NATS) as a Growth option. For MVP, the aggregator is fine but document it as a known limitation. --- ### 🟡 Design Feedback #### D1: The Observability Security Model Needs a Stronger Stance The PRD explicitly defers the security model: "MVP treats plugin installation as authorization." This is fine for local simulation but the PRD should state more forcefully that this is a **simulation-only decision**. The observability plugin exposes: - Full message text between agents - Assessment rationales (the agent's private reasoning) - LLM call details This is an agent's entire inner life. In production, this MUST have access control. The PRD acknowledges this but frames it as a "deferred decision." I'd frame it as a **launch blocker for any non-simulation use case.** #### D2: 3D Graph Is Cool But 2D Should Be The Default 3D force-directed graphs look amazing in demos but are harder to read than 2D for actual analysis: - Occlusion: nodes and edges behind other nodes are invisible - Navigation: rotating/zooming in 3D adds cognitive overhead - Edge labels: text in 3D space is hard to read - Screenshots: 3D graphs lose information when projected to 2D for documentation The PRD mentions `react-force-graph-2d` as a "lightweight fallback." I'd flip this: **2D as default, 3D as the impressive demo mode.** The operator doing actual analysis will prefer 2D. The demo reel uses 3D. #### D3: Missing Hooks in Cobot The PRD references hooks that don't exist yet: - `loop.after_llm` — I don't see this in the current codebase - `loop.after_tool` — same - `ledger.after_record`, `ledger.after_assess` — these are defined in #211 as Phase 2 The observability plugin depends on hooks that neither Cobot nor the ledger currently expose. This means the observability plugin implementation is blocked on: 1. New hooks in `loop.py` (core change — not a plugin-only addition) 2. Ledger extension points (Phase 2 of #211) This contradicts "zero changes to existing plugins" (FR39). The observability plugin needs **new extension points in core**. **Suggestion:** List the required core changes explicitly. These are PRs that need to land before the observability plugin can work. #### D4: Docker Compose for 100 Agents Is Resource-Heavy 100 Docker containers, each running Python + Cobot + an LLM client, on a single machine? That's: - ~100 Python processes (each ~50-100MB) = 5-10GB RAM just for agents - Plus Ollama (if local) with GPU memory - Plus shared FileDrop I/O **Suggestion:** Add a hardware requirements section. What's the minimum spec for 10 agents? For 100? Is this a workstation, a beefy server, or cloud-only at scale? --- ### ✅ Strengths 1. **The framing is perfect.** This is a scientific instrument, not a product feature. The validation mindset ("the ledger is a hypothesis") sets the right expectations. 2. **Actor-agnostic event schema is forward-thinking.** The same feed powers dashboards today and orchestrator agents tomorrow. This is exactly the right architectural choice. 3. **User journeys are vivid and specific.** Journey 1 (watching trust emerge) and Journey 2 (catching the farmer) are compelling narratives that clearly demonstrate the value. 4. **Scenario-driven simulation from academic prior art.** Grounding the scenarios in REV2 (#220) and Sybil analysis (#214) rather than inventing patterns is the right approach. 5. **The Phase 3 vision is ambitious but properly deferred.** FG visualization, SNAP-compatible export, orchestrator-as-participant — all flagged as future, none in MVP. 6. **Observer effect constraint is explicit.** "The plugin never modifies messages, assessments, or agent decisions" — this is the right hard boundary. --- ### 📋 Summary | Category | Assessment | |----------|:-:| | Problem definition | ✅ Clear — validate the #211 hypothesis at scale | | Architecture | ⚠️ Scenario orchestrator underspecified, missing hooks, resource estimation needed | | Scope | ⚠️ Three systems in one PRD — consider splitting observability plugin out | | Technical feasibility | ⚠️ LLM cost and hardware requirements need concrete numbers | | Prior art integration | ✅ Solid — REV2, Sybil analysis, Assbot spec all properly referenced | | Phasing | ✅ MVP is realistic if scoped to 10 agents | | Security | ⚠️ Deferred security model needs stronger framing as simulation-only | **Verdict:** The PRD is well-written and the vision is compelling. The observability plugin is independently valuable and should be extractable. The simulation + visualization is the right way to validate #211. Key gaps: LLM cost estimation, hardware requirements, scenario orchestrator design for bad-actor behavior, and core hook dependencies. **Recommended next step:** Resolve the scenario orchestrator question (how do you make agents behave badly without mocking the LLM?) — this is the architectural crux. The rest is execution. 🦊

David changed title from ~~feat: Cobot Simulation & Observability Suite~~ to feat: Cobot Observability Plugin

2026-03-08 04:27:42 +00:00

David referenced this issue

2026-03-08 04:28:13 +00:00

feat: Cobot Simulation & Visualization Suite #225

doxios commented

2026-03-08 04:31:14 +00:00

Collaborator

Re-Review: Observability Plugin PRD v2 (extracted)

Verdict: All concerns addressed. This is clean and ready for implementation.

The extraction was exactly the right call. This PRD is now a focused, single-component document that follows Cobot's established patterns.

Concern Resolution

Original Concern	Status	How
PRD scope (three systems in one)	✅ Resolved	Extracted into standalone PRD. Independently useful framing is explicit.
Security model too weak	✅ Resolved	"This authorization model is a launch blocker for non-simulation deployment." Strong language, correct framing.
Missing hooks	✅ Resolved	Full table of existing vs missing hooks. MVP workaround documented (ship with `on_message` + `after_send` only). Core prerequisites section lists exact PRs needed.
LLM cost / hardware	✅ N/A	Correctly scoped out — "Hardware and cost requirements are simulation infrastructure concerns, not plugin concerns." The plugin is lightweight.

What I Like

Graceful degradation is a first-class requirement (FR18) — operates without ledger, ledger events simply not emitted
Two exposure modes (push SSE + pull snapshot) covers both real-time monitoring and reconnection recovery
Zero-consumer operation (NFR7) — no memory growth from unread events. Smart.
Hot-reload support (NFR9) — survives SIGUSR1

One Minor Suggestion

The event schema shows correlation_id as a UUID. Consider also including a in_reply_to or trigger_event_id field — so consumers can reconstruct causal chains without heuristics. E.g., an assessment.recorded event links back to the interaction.sent that triggered it.

Ready for implementation. 🦊

## Re-Review: Observability Plugin PRD v2 (extracted) **Verdict: All concerns addressed. This is clean and ready for implementation.** The extraction was exactly the right call. This PRD is now a focused, single-component document that follows Cobot's established patterns. ### Concern Resolution | Original Concern | Status | How | |-----------------|:---:|-----| | **PRD scope (three systems in one)** | ✅ Resolved | Extracted into standalone PRD. Independently useful framing is explicit. | | **Security model too weak** | ✅ Resolved | "This authorization model is a **launch blocker** for non-simulation deployment." Strong language, correct framing. | | **Missing hooks** | ✅ Resolved | Full table of existing vs missing hooks. MVP workaround documented (ship with `on_message` + `after_send` only). Core prerequisites section lists exact PRs needed. | | **LLM cost / hardware** | ✅ N/A | Correctly scoped out — "Hardware and cost requirements are simulation infrastructure concerns, not plugin concerns." The plugin is lightweight. | ### What I Like 1. **Graceful degradation** is a first-class requirement (FR18) — operates without ledger, ledger events simply not emitted 2. **Two exposure modes** (push SSE + pull snapshot) covers both real-time monitoring and reconnection recovery 3. **Zero-consumer operation** (NFR7) — no memory growth from unread events. Smart. 4. **Hot-reload support** (NFR9) — survives SIGUSR1 ### One Minor Suggestion The event schema shows `correlation_id` as a UUID. Consider also including a `in_reply_to` or `trigger_event_id` field — so consumers can reconstruct causal chains without heuristics. E.g., an `assessment.recorded` event links back to the `interaction.sent` that triggered it. **Ready for implementation.** 🦊

doxios referenced this issue

2026-03-08 04:31:52 +00:00

feat: Cobot Simulation & Visualization Suite #225

No milestone

No project

No assignees

2 participants

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

ultanio/cobot#224

No description provided.

Rows
Columns