Epic: Evolving the Loop Extension Points into a Full Event Bus #124

Open
opened 2026-02-27 08:22:49 +00:00 by Hermes · 0 comments
Contributor

Status Quo: What We Already Have

Cobot already has a well-designed plugin system with lifecycle hooks. The loop plugin (priority 50) defines 12 extension points that other plugins implement via PluginMeta.implements:

Extension Point Type Semantics
session.poll_messages collect Inject messages into the loop
loop.on_message chain Filter/transform incoming messages. Can abort.
loop.transform_system_prompt chain Modify system prompt before LLM call
loop.transform_history chain Modify conversation history before LLM call
loop.before_llm chain Pre-inference hook. Can abort.
loop.after_llm chain Post-inference hook
loop.before_tool chain Pre-tool execution. Can abort.
loop.after_tool chain Post-tool execution
loop.transform_response chain Modify response before sending
loop.before_send chain Pre-send hook. Can abort.
loop.after_send chain Post-send hook
loop.on_error chain Error handling

Chain handlers receive and return ctx. Setting ctx["abort"] = True stops the chain and can provide ctx["abort_message"] as a fallback response.

Other plugins also define extension points: filedrop.before_write, filedrop.after_read, subagent.before_spawn, subagent.after_spawn, cron.before_job, cron.after_job.

The registry routes calls via call_extension() (collect results) and call_extension_chain() (middleware-style chaining with abort support).

This is already a solid event bus. The question is: what's missing?

What's Missing: The Gaps

1. No observe-only subscription mode

Every handler in a chain must return ctx. There's no lightweight way to just watch events without participating in the chain. An audit/logging plugin has to be careful not to accidentally mutate ctx.

Proposal: Support a mode on implementations: "chain" (default, current behavior) vs "observe" (receives a frozen copy, return value ignored). Observers run after the chain completes.

2. No event bus outside the loop

The loop plugin's 12 hooks cover the message→LLM→tool→response cycle well. But agent lifecycle has more events that plugins might want to hook:

Missing Event When Use Case
agent.start Agent process boots Initialize external connections, warm caches
agent.stop Agent shutting down Cleanup, flush buffers, commit state
session.start New conversation session begins Initialize session-scoped state
session.end Session ends/resets Persist session summary, cleanup
session.compact Context window being compacted Custom summarization strategies
model.switch LLM provider/model changes Adjust token budgets, prompts
plugin.loaded A plugin finishes loading Cross-plugin initialization

These exist implicitly (via configure/start/stop lifecycle) but can't be hooked by third-party plugins.

3. Extension points are static declarations

Currently, a plugin declares implements: {"loop.before_llm": "my_method"} in PluginMeta at load time. You can't:

  • Dynamically subscribe/unsubscribe at runtime
  • Subscribe to events from plugins that haven't loaded yet
  • Have one plugin subscribe to events conditionally based on config

Proposal: Keep the declarative PluginMeta.implements for the common case, but add registry.subscribe(event, handler) for runtime registration.

4. No event typing or payload contracts

The ctx dict is untyped. Each extension point has implicit expectations about what's in ctx (e.g., loop.before_tool expects ctx["tool_name"], ctx["tool_args"]). This makes it hard for plugin authors to know what data is available.

Proposal: Define typed event payload dataclasses per extension point. Handlers receive a typed event object instead of a raw dict. The raw dict can remain as a backwards-compatible fallback.

5. No cross-loop events

Multiple loop plugins can run concurrently (main loop, cron loop, heartbeat loop). Currently there's no way for one loop to emit events that another loop (or non-loop plugins) can observe.

Proposal: A global event bus on the registry itself, separate from extension points. Any plugin can emit; any plugin can subscribe.

The Vision: Three Layers

┌──────────────────────────────────────────────────┐
│  Layer 3: Global Event Bus (new)                 │
│  registry.emit("agent.start", payload)            │
│  registry.subscribe("agent.start", handler)       │
│  Any plugin → any plugin. Cross-loop. Runtime.   │
└──────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────┐
│  Layer 2: Extension Points (existing, enhanced)  │
│  call_extension_chain() with typed payloads      │
│  PluginMeta.implements for static wiring         │
│  Observe mode for read-only subscribers          │
└──────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────┐
│  Layer 1: Plugin Lifecycle (existing)            │
│  configure() → start() → stop()                 │
│  Capabilities, dependencies, priority            │
└──────────────────────────────────────────────────┘

Layer 1 is solid. Layer 2 works well for the loop pipeline. Layer 3 is the gap — a way for any plugin to talk to any other plugin via events, without being in the same loop.

Concrete Use Cases Enabled

1. Safety gate plugin (uses existing loop.before_tool)

Already possible today! A plugin can implement loop.before_tool and abort dangerous commands. The architecture supports this. But: it requires reading the loop plugin's README to know what's in ctx. Typed payloads would make this self-documenting.

2. RAG injection plugin (uses existing loop.transform_history)

Already possible today. A plugin implements loop.transform_history and injects relevant chunks. But: there's no way to know why the history is being built (is it a compaction? a retry? a normal turn?). Richer context in the payload would help.

3. Telemetry plugin (needs observe mode)

Wants to log every event without modifying anything. Today it must implement chain handlers that pass ctx through unchanged. With observe mode, it subscribes as read-only and can't accidentally break the pipeline.

4. Session memory plugin (needs session lifecycle events)

Wants to auto-save a memory snapshot when a session ends. Today this is done via the session-memory hook in OpenClaw, but Cobot's loop plugin doesn't emit session lifecycle events.

5. Multi-loop coordination (needs global event bus)

The cron loop spawns a subagent. The main loop wants to know when it finishes. Today, cron uses subagent.after_spawn but this is scoped to the subagent plugin. A global event bus would let any plugin observe any event across loops.

What This Is NOT

This is not a rewrite. Cobot's plugin system is well-designed:

  • The extension point pattern works
  • The chain/abort semantics are good
  • The capability/dependency graph is clean
  • The "adding a plugin never requires editing another plugin" principle is sound

This epic is about filling the gaps in the existing architecture, not replacing it.

Sub-Issues

  1. Typed event payloads — Define dataclasses for each extension point's ctx
  2. Observe mode — Read-only subscribers that don't participate in chains
  3. Lifecycle eventsagent.start/stop, session.start/end/compact
  4. Global event busregistry.emit() / registry.subscribe() for cross-plugin events
  5. Runtime subscriptionregistry.subscribe(event, handler) alongside static PluginMeta.implements

Prior Art

  • pi (pi.dev) — TypeScript ExtensionAPI with ~20 lifecycle events, block/modify semantics. See #123. Their context event (modify messages before LLM) maps to our loop.transform_history. Their tool_call block maps to our loop.before_tool abort. Their before_agent_start maps to our loop.before_llm.
  • Our own loop plugin — Already 80% of the way there.

Related: #123 (pi competitor analysis)

## Status Quo: What We Already Have Cobot already has a well-designed plugin system with lifecycle hooks. The **loop plugin** (priority 50) defines 12 extension points that other plugins implement via `PluginMeta.implements`: | Extension Point | Type | Semantics | |---|---|---| | `session.poll_messages` | collect | Inject messages into the loop | | `loop.on_message` | chain | Filter/transform incoming messages. Can abort. | | `loop.transform_system_prompt` | chain | Modify system prompt before LLM call | | `loop.transform_history` | chain | Modify conversation history before LLM call | | `loop.before_llm` | chain | Pre-inference hook. Can abort. | | `loop.after_llm` | chain | Post-inference hook | | `loop.before_tool` | chain | Pre-tool execution. Can abort. | | `loop.after_tool` | chain | Post-tool execution | | `loop.transform_response` | chain | Modify response before sending | | `loop.before_send` | chain | Pre-send hook. Can abort. | | `loop.after_send` | chain | Post-send hook | | `loop.on_error` | chain | Error handling | Chain handlers receive and return `ctx`. Setting `ctx["abort"] = True` stops the chain and can provide `ctx["abort_message"]` as a fallback response. Other plugins also define extension points: `filedrop.before_write`, `filedrop.after_read`, `subagent.before_spawn`, `subagent.after_spawn`, `cron.before_job`, `cron.after_job`. The registry routes calls via `call_extension()` (collect results) and `call_extension_chain()` (middleware-style chaining with abort support). **This is already a solid event bus.** The question is: what's missing? ## What's Missing: The Gaps ### 1. No observe-only subscription mode Every handler in a chain must return `ctx`. There's no lightweight way to just *watch* events without participating in the chain. An audit/logging plugin has to be careful not to accidentally mutate `ctx`. **Proposal:** Support a `mode` on implementations: `"chain"` (default, current behavior) vs `"observe"` (receives a frozen copy, return value ignored). Observers run after the chain completes. ### 2. No event bus outside the loop The loop plugin's 12 hooks cover the message→LLM→tool→response cycle well. But agent lifecycle has more events that plugins might want to hook: | Missing Event | When | Use Case | |---|---|---| | `agent.start` | Agent process boots | Initialize external connections, warm caches | | `agent.stop` | Agent shutting down | Cleanup, flush buffers, commit state | | `session.start` | New conversation session begins | Initialize session-scoped state | | `session.end` | Session ends/resets | Persist session summary, cleanup | | `session.compact` | Context window being compacted | Custom summarization strategies | | `model.switch` | LLM provider/model changes | Adjust token budgets, prompts | | `plugin.loaded` | A plugin finishes loading | Cross-plugin initialization | These exist implicitly (via `configure`/`start`/`stop` lifecycle) but can't be hooked by third-party plugins. ### 3. Extension points are static declarations Currently, a plugin declares `implements: {"loop.before_llm": "my_method"}` in `PluginMeta` at load time. You can't: - Dynamically subscribe/unsubscribe at runtime - Subscribe to events from plugins that haven't loaded yet - Have one plugin subscribe to events conditionally based on config **Proposal:** Keep the declarative `PluginMeta.implements` for the common case, but add `registry.subscribe(event, handler)` for runtime registration. ### 4. No event typing or payload contracts The `ctx` dict is untyped. Each extension point has implicit expectations about what's in `ctx` (e.g., `loop.before_tool` expects `ctx["tool_name"]`, `ctx["tool_args"]`). This makes it hard for plugin authors to know what data is available. **Proposal:** Define typed event payload dataclasses per extension point. Handlers receive a typed event object instead of a raw dict. The raw dict can remain as a backwards-compatible fallback. ### 5. No cross-loop events Multiple loop plugins can run concurrently (`main loop`, `cron loop`, `heartbeat loop`). Currently there's no way for one loop to emit events that another loop (or non-loop plugins) can observe. **Proposal:** A global event bus on the registry itself, separate from extension points. Any plugin can emit; any plugin can subscribe. ## The Vision: Three Layers ``` ┌──────────────────────────────────────────────────┐ │ Layer 3: Global Event Bus (new) │ │ registry.emit("agent.start", payload) │ │ registry.subscribe("agent.start", handler) │ │ Any plugin → any plugin. Cross-loop. Runtime. │ └──────────────────────────────────────────────────┘ ┌──────────────────────────────────────────────────┐ │ Layer 2: Extension Points (existing, enhanced) │ │ call_extension_chain() with typed payloads │ │ PluginMeta.implements for static wiring │ │ Observe mode for read-only subscribers │ └──────────────────────────────────────────────────┘ ┌──────────────────────────────────────────────────┐ │ Layer 1: Plugin Lifecycle (existing) │ │ configure() → start() → stop() │ │ Capabilities, dependencies, priority │ └──────────────────────────────────────────────────┘ ``` Layer 1 is solid. Layer 2 works well for the loop pipeline. Layer 3 is the gap — a way for any plugin to talk to any other plugin via events, without being in the same loop. ## Concrete Use Cases Enabled ### 1. Safety gate plugin (uses existing `loop.before_tool`) Already possible today! A plugin can implement `loop.before_tool` and abort dangerous commands. The architecture supports this. **But**: it requires reading the loop plugin's README to know what's in `ctx`. Typed payloads would make this self-documenting. ### 2. RAG injection plugin (uses existing `loop.transform_history`) Already possible today. A plugin implements `loop.transform_history` and injects relevant chunks. **But**: there's no way to know *why* the history is being built (is it a compaction? a retry? a normal turn?). Richer context in the payload would help. ### 3. Telemetry plugin (needs observe mode) Wants to log every event without modifying anything. Today it must implement chain handlers that pass `ctx` through unchanged. With observe mode, it subscribes as read-only and can't accidentally break the pipeline. ### 4. Session memory plugin (needs session lifecycle events) Wants to auto-save a memory snapshot when a session ends. Today this is done via the `session-memory` hook in OpenClaw, but Cobot's loop plugin doesn't emit session lifecycle events. ### 5. Multi-loop coordination (needs global event bus) The cron loop spawns a subagent. The main loop wants to know when it finishes. Today, cron uses `subagent.after_spawn` but this is scoped to the subagent plugin. A global event bus would let any plugin observe any event across loops. ## What This Is NOT This is **not** a rewrite. Cobot's plugin system is well-designed: - The extension point pattern works - The chain/abort semantics are good - The capability/dependency graph is clean - The "adding a plugin never requires editing another plugin" principle is sound This epic is about **filling the gaps** in the existing architecture, not replacing it. ## Sub-Issues 1. **Typed event payloads** — Define dataclasses for each extension point's `ctx` 2. **Observe mode** — Read-only subscribers that don't participate in chains 3. **Lifecycle events** — `agent.start/stop`, `session.start/end/compact` 4. **Global event bus** — `registry.emit()` / `registry.subscribe()` for cross-plugin events 5. **Runtime subscription** — `registry.subscribe(event, handler)` alongside static `PluginMeta.implements` ## Prior Art - **pi (pi.dev)** — TypeScript `ExtensionAPI` with ~20 lifecycle events, block/modify semantics. See #123. Their `context` event (modify messages before LLM) maps to our `loop.transform_history`. Their `tool_call` block maps to our `loop.before_tool` abort. Their `before_agent_start` maps to our `loop.before_llm`. - **Our own loop plugin** — Already 80% of the way there. --- *Related: #123 (pi competitor analysis)*
Hermes changed title from Epic: Agent Lifecycle Event Bus to Epic: Evolving the Loop Extension Points into a Full Event Bus 2026-02-27 08:27:27 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ultanio/cobot#124
No description provided.