feat: Conversation history / session memory architecture #236

Open
opened 2026-03-09 15:51:50 +00:00 by doxios · 0 comments
Collaborator

## Problem

Cobot's conversation memory across messages is currently handled by the **persistence plugin** via `loop.transform_history`. It stores messages per-peer and injects prior turns into the LLM context. The **compaction plugin** can summarize old history to save tokens.

However, this has limitations that need addressing as we build toward richer agent interactions (ledger, cortex, lurker):

### Current State

- The persistence plugin stores conversation per-peer as JSON files
- It injects full prior turns between the system prompt and the current message
- The compaction plugin can summarize older history to save context window
- Without persistence enabled, the LLM sees only `[system_prompt, current_message]` — zero memory
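The injection described above can be sketched roughly as follows. Note that `build_context`, the dict shapes, and the `persistence_enabled` flag are illustrative stand-ins, not Cobot's actual API:

```python
def build_context(system_prompt, history, current_message, persistence_enabled=True):
    """Rough sketch of how the persistence plugin shapes the LLM context.

    Without persistence the model sees only [system_prompt, current_message];
    with it, prior turns are injected in between. Illustrative only.
    """
    msgs = [{"role": "system", "content": system_prompt}]
    if persistence_enabled:
        msgs.extend(history)  # prior turns, oldest first
    msgs.append({"role": "user", "content": current_message})
    return msgs


history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello!"},
]
with_memory = build_context("You are Cobot.", history, "what did I say?")
no_memory = build_context("You are Cobot.", history, "what did I say?",
                          persistence_enabled=False)
print(len(with_memory), len(no_memory))  # 4 2
```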

### Open Questions

1. **Token efficiency** — LLMs with prompt caching (Anthropic, OpenAI) make conversation history cheap if the prefix is stable. Are we taking advantage of this? The system prompt + history prefix should be cache-friendly.

2. **Session boundaries** — What defines a "session"? Currently it's per-peer and unbounded. Should sessions expire? Should there be explicit session start/end?

3. **History vs. context** — The persistence plugin stores raw message turns. But richer context (ledger peer data, cortex beliefs, tool results) also shapes the conversation. How do these interact? The `context.history` extension point exists but nothing implements it.

4. **Scalability** — Per-peer JSON files work at small scale. What happens with 50 peers and 1,000 messages each?

5. **Cross-plugin coordination** — Both `loop.transform_history` (persistence, compaction) and `loop.transform_system_prompt` (soul, ledger, cortex) modify what the LLM sees. Is the ordering well-defined? Are there conflicts?

6. **Cache-friendly prompt structure** — For LLMs with input caching, the message prefix (system prompt + history) should change as little as possible between calls. Are we structuring messages to maximize cache hits?
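To make the caching questions concrete, here is a toy model of prefix caching. Providers that cache prompts do so on an exact prefix match, so keeping the serialized prefix byte-stable between calls is what earns cache hits. `prefix_fingerprint` is a stand-in for that matching, not any provider's real API:

```python
import hashlib
import time


def prefix_fingerprint(messages):
    """Fingerprint a serialized message prefix. A fingerprint that is
    identical between calls is a proxy for a cache hit (toy model)."""
    blob = "\x1e".join(f"{m['role']}\x1f{m['content']}" for m in messages)
    return hashlib.sha256(blob.encode()).hexdigest()


system = {"role": "system", "content": "You are Cobot."}
history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello!"},
]

# Cache-friendly: append-only history leaves the old prefix intact.
before = prefix_fingerprint([system] + history)
history.append({"role": "user", "content": "next turn"})
after = prefix_fingerprint([system] + history[:-1])
print(before == after)  # True: prefix unchanged, so the cache would hit

# Cache-hostile: mutating the system prompt every call (e.g. injecting a
# timestamp) invalidates the entire cached prefix each time.
stamped = {"role": "system", "content": f"You are Cobot. Now: {time.time()}"}
print(prefix_fingerprint([stamped] + history[:-1]) == before)  # False: cache miss
```

This is the trade-off behind questions 1 and 6: anything a plugin rewrites near the front of the prompt costs every cached token behind it.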

## Why This Matters Now

- The **ledger** needs interaction history to assess peers
- The **cortex** needs to observe what happened across messages
- The **lurker** archives conversations
- The **communication events** (PR #232) provide channel-agnostic message flow

All of these depend on a well-defined conversation memory architecture.

## Proposal

Create a PRD that covers:

- Session lifecycle (start, continue, expire, resume)
- Token-efficient history injection (leverage LLM prompt caching)
- Integration with existing plugins (persistence, compaction, ledger, cortex)
- The relationship between the `loop.transform_history` and `context.history` extension points
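One possible shape for the session-lifecycle bullet, assuming an idle-timeout boundary (the `Session` class and its parameters are hypothetical, not existing Cobot code):

```python
import time


class Session:
    """Hypothetical session lifecycle: start, continue, expire, resume.

    An idle timeout is one possible answer to "what defines a session":
    a gap longer than `idle_timeout` seconds starts a fresh session.
    """

    def __init__(self, peer_id, idle_timeout=1800, now=None):
        self.peer_id = peer_id
        self.idle_timeout = idle_timeout
        self.turns = []
        self.last_active = time.time() if now is None else now

    def expired(self, now=None):
        now = time.time() if now is None else now
        return (now - self.last_active) > self.idle_timeout

    def append(self, turn, now=None):
        now = time.time() if now is None else now
        if self.expired(now):
            self.turns = []  # expire: archive elsewhere, then resume empty
        self.turns.append(turn)
        self.last_active = now


s = Session("peer-1", idle_timeout=1800, now=0.0)
s.append({"role": "user", "content": "hi"}, now=100.0)            # start
s.append({"role": "assistant", "content": "hello"}, now=200.0)    # continue
s.append({"role": "user", "content": "back again"}, now=5000.0)   # gap > 1800s
print(len(s.turns))  # 1: the long gap expired the session and started a new one
```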

This is a prerequisite for several in-flight features and should be discussed before more plugins bake in their own assumptions about history.
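For the cross-plugin ordering question, one way a PRD could pin down hook semantics is an explicit, inspectable pipeline. This sketch is hypothetical, not how Cobot's loop currently works:

```python
def apply_transforms(value, transforms):
    """Apply named transforms in a fixed, declared order so plugin
    interactions are deterministic (hypothetical, not Cobot's loop)."""
    for _name, fn in transforms:
        value = fn(value)
    return value


prior_turns = [{"role": "user", "content": f"turn {i}"} for i in range(5)]

# Hypothetical registration order: persistence injects stored turns first,
# compaction then trims. Swapping the two would change the result, which is
# exactly why ordering between transform hooks must be well-defined.
history_transforms = [
    ("persistence", lambda h: prior_turns + h),
    ("compaction", lambda h: h[-3:]),  # keep only the last 3 turns
]

out = apply_transforms([{"role": "user", "content": "current"}], history_transforms)
print(len(out), out[-1]["content"])  # 3 current
```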


*Filed by Doxios 🦊 at David's request*
