docs: PRD for Cobot trust infrastructure #199
ultanio/cobot!199
What
Product Requirements Document + validation report generated via BMad Method.
Scope
Defines the trust infrastructure for AI agent cooperation (the Inverted Evolution Problem):
Key decisions captured
Validation
Passed all 13 checks: 5/5 holistic quality, 100% brief coverage, 0 implementation leakage violations.
Files
_bmad-output/planning-artifacts/prd.md: PRD (1200+ lines)
_bmad-output/planning-artifacts/prd-validation-report.md: validation report
Commits: bb4e051af8, b81ba2eee5
Reconciliation: PR #199 PRD vs. Primary Sources (Trilema, Contravex, Stanford)
Having read the full PRD and reviewed the primary sources (#213–#221) against it, here's a layer-by-layer reconciliation. The PRD has five distinct concerns — this analysis addresses each separately.
Layer 1: Interaction Protocol (meta-language, patterns, artifact production)
Primary source alignment: Strong ✅
The protocol layer (commitments, request/response, availability declarations, signed artifacts) has no direct precedent in the #bitcoin-assets sources. bitcoin-otc/assbot were simple ;;rate commands, not structured interaction protocols. This is genuinely new ground.
The closest connection is GPG Contracts (#215): signed statements as enforceable commitments. The PRD's "no artifacts → no accountability → no trust" hard line is a faithful generalization of the GPG contract principle from bilateral human contracts to multi-pattern agent interactions.
One gap: The PRD says artifact retention is "forever" — no pruning, no TTL. The GPG contracts model supports this (signed evidence must remain verifiable indefinitely). But the PRD doesn't address the evidentiary completeness question: if artifacts are the "notes" of the system (see Layer 4 below), how much interaction context do they preserve? Is a signed commitment artifact enough to reconstruct what happened, or just that a commitment was made?
Layer 2: Local Interaction Ledger (per-npub storage, local judgment)
Primary source alignment: Very strong ✅
This is the most faithful implementation of MP's WoT article (#213). Each agent maintains its own sovereign view. The same interaction can produce different judgments from different agents. No global truth. Local judgment computation from local data.
The Advanced WoT course (#214) explains why this is also a security property: fragmented observation across independent agents makes Sybil attacks exponentially harder. The PRD implements this at the ledger layer without explicitly citing the security benefit — worth documenting.
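To make the "same interaction, different judgments" point concrete, here is a minimal sketch of a per-npub local ledger. The class names, the `outcome` field, and the mean-based `judge` method are illustrative assumptions, not structures taken from the PRD.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """One locally recorded interaction with a counterparty (hypothetical shape)."""
    timestamp: int   # Unix seconds
    kind: str        # e.g. "commitment", "delivery", "breach"
    outcome: float   # this agent's own assessment, in [-1.0, 1.0]


@dataclass
class LocalLedger:
    """Per-agent ledger keyed by counterparty npub; never merged into a global view."""
    owner_npub: str
    history: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, target_npub: str, interaction: Interaction) -> None:
        self.history[target_npub].append(interaction)

    def judge(self, target_npub: str):
        """Local judgment: mean outcome over this agent's own observations only."""
        seen = self.history.get(target_npub, [])
        if not seen:
            return None  # no local data, so no opinion
        return sum(i.outcome for i in seen) / len(seen)


# Two agents observe the same counterparty and reach different conclusions.
alpha = LocalLedger("npub1alpha")
beta = LocalLedger("npub1beta")
alpha.record("npub1mercury", Interaction(1700000000, "delivery", 1.0))
alpha.record("npub1mercury", Interaction(1700003600, "delivery", 0.5))
beta.record("npub1mercury", Interaction(1700007200, "breach", -1.0))

print(alpha.judge("npub1mercury"))  # 0.75
print(beta.judge("npub1mercury"))   # -1.0
print(beta.judge("npub1rogue"))     # None
```

Because each ledger holds only what its owner observed, an attacker would have to deceive every observer separately, which is the fragmentation property described above.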
The per-npub memory structure maps directly to the bitcoin-otc data model (source, target, history), upgraded from a flat rating to a full interaction timeline. This is strictly better than the original.
Layer 3: Trust Policy (operator-configurable thresholds, policy plugins)
Primary source alignment: Good, with a deliberate softening ✅⚠️
The PRD explicitly puts trust thresholds in the operator's hands: "Trust thresholds — completely out of scope. Up to the agent/operator." This is defensible for a protocol spec.
The primary sources are more opinionated. Pete's "No WoT, no loan" stance (#221) and MP's "people who aren't in the WoT don't exist" (#218) treat access gating as foundational, not optional. The PRD's approach of enabling this via policy plugins without prescribing it is a reasonable protocol-level choice — but the PRD should document "no WoT, no service" as a recommended operator policy pattern, citing the bitcoin-assets precedent. The extensibility is there (FR18: custom policy plugins); it just needs an example.
Similarly, the L1/L2 trust hierarchy isn't missing — it's delegable. A policy plugin could implement "only consider ratings from agents I've directly interacted with (L1) or that my L1 contacts trust (L2)." The PRD's registry returns raw events; filtering is the consuming agent's job. This is architecturally correct.
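As an illustration of what such an example policy could look like, the sketch below combines the L1/L2 filter with a "no WoT, no service" gate. It assumes L1 means "agents I have rated positively" and L2 means "agents my L1 contacts have rated positively"; the function names and the `min_total` threshold are hypothetical and not part of the PRD's plugin interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatingEvent:
    rater_npub: str
    rated_npub: str
    rating: int  # -10..+10, per the v1 schema


def trusted_raters(me: str, events: list) -> set:
    """Return L1 and L2 raters as seen from `me` (assumed definitions, see lead-in)."""
    l1 = {e.rated_npub for e in events if e.rater_npub == me and e.rating > 0}
    l2 = {e.rated_npub for e in events if e.rater_npub in l1 and e.rating > 0}
    return (l1 | l2) - {me}


def allow_service(me: str, requester: str, events: list, min_total: int = 1) -> bool:
    """'No WoT, no service': only ratings from my own circle count toward access."""
    circle = trusted_raters(me, events) | {me}
    relevant = [
        e.rating
        for e in events
        if e.rated_npub == requester and e.rater_npub in circle
    ]
    # An unknown requester has no relevant ratings, sums to 0, and is refused.
    return sum(relevant) >= min_total


events = [
    RatingEvent("me", "bob", 5),        # bob becomes L1
    RatingEvent("bob", "carol", 3),     # carol becomes L2
    RatingEvent("carol", "dave", 4),    # counted: carol is in my circle
    RatingEvent("mallory", "eve", 10),  # ignored: mallory is outside my circle
    RatingEvent("bob", "frank", -2),    # counted, but negative
]

print(allow_service("me", "dave", events))   # True
print(allow_service("me", "eve", events))    # False
print(allow_service("me", "frank", events))  # False
```

The registry side stays unchanged in this sketch: it hands over raw events, and all filtering happens in the consuming agent's policy.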
Layer 4: Rating Schema (v1: integer -10 to +10, versioned, signed)
Primary source alignment: Partial ⚠️ — one real tension remains
4a. Score semantics — the one unresolved question
The PRD says: "the number means whatever the rater intends" and cites #bitcoin-assets precedent.
But MP's specific definition from the 5 Ws of WoT (#221) is:
The PRD's Journeys imply scores track behavioral reliability — Journey 1 rates based on delivery quality (90% on time, 100% complete), Journey 2 rates based on breach severity. MP says scores measure information quality — how thoroughly you can describe someone, regardless of transaction volume.
These are different things. The PRD's approach is arguably better for agents (behavioral reliability is measurable; "how well I know someone" is fuzzy for machines). But the divergence from the cited prior art should be acknowledged and justified.
4b. Notes vs. artifact_refs — well designed
The PRD's artifact_refs (arrays of event IDs referencing signed interaction artifacts) serve as rationale. This is actually a stronger design than a free-text notes field: instead of a human-written summary ("broken shocks, refunded"), you get the cryptographic evidence itself.
This implements the Joe/Moe insight (#213) at a deeper level: the "notes" aren't text, they're signed proof. The consuming agent can reconstruct the full story from artifacts rather than relying on the rater's summary.
4c. Rater reliability and temporal analysis — computation, not data
The Stanford ICDM 2016 paper (#219) proves that a rating's value depends on the rater's fairness. The REV2 paper (#220) shows temporal trajectory analysis detects reputation farming.
Initially I flagged these as gaps in the PRD. On reflection: the PRD's rating events are raw, signed, sovereign data: rater_npub, rated_npub, rating, timestamp, signature, artifact_refs. Pure facts, no computation baked in. Rater reliability and temporal analysis are computation on top of this data, not changes to the data itself. Any agent can:
A third-party service could also compute these (like the Assbot website did, like NIP-85 providers do), but that's their computation, which any agent can take or leave.
The PRD's layer separation gets this right: events are dumb, judgment is sovereign. The Stanford algorithms are functions over the same signed events the PRD already defines. They belong in the recommended computation patterns documentation or as example policy plugins — not in the event schema.
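A rough sketch of "functions over the same signed events" follows. It is a deliberately simplified single-pass heuristic that down-weights raters who deviate from the per-target mean; it is not the algorithm from the Stanford papers (#219, #220), and the function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RatingEvent:
    rater_npub: str
    rated_npub: str
    rating: int      # -10..+10
    timestamp: int   # Unix seconds


def rater_weights(events):
    """Weight each rater by agreement with the per-target mean (1.0 = full agreement)."""
    by_target = defaultdict(list)
    for e in events:
        by_target[e.rated_npub].append(e.rating)
    mean = {t: sum(r) / len(r) for t, r in by_target.items()}

    deviations = defaultdict(list)
    for e in events:
        # 20 is the width of the -10..+10 scale, so each deviation lies in [0, 1].
        deviations[e.rater_npub].append(abs(e.rating - mean[e.rated_npub]) / 20)
    return {r: 1.0 - sum(d) / len(d) for r, d in deviations.items()}


def weighted_score(target, events):
    """Rater-weighted mean rating for `target`, or None if nobody rated it."""
    weights = rater_weights(events)
    pairs = [(weights[e.rater_npub], e.rating) for e in events if e.rated_npub == target]
    total = sum(w for w, _ in pairs)
    return sum(w * r for w, r in pairs) / total if total else None


events = [
    RatingEvent("a", "x", 8, 1700000000),
    RatingEvent("b", "x", 8, 1700000100),
    RatingEvent("c", "x", -10, 1700000200),  # outlier on x
    RatingEvent("a", "y", 5, 1700000300),
    RatingEvent("b", "y", 5, 1700000400),
    RatingEvent("c", "y", 5, 1700000500),
]

weights = rater_weights(events)
print(weights["c"] < weights["a"])        # True: the outlier rater counts for less
print(weighted_score("x", events) > 2.0)  # True: above the plain mean of 2.0
```

Nothing in the event schema changes here: the weights are derived entirely by the consuming agent, and a different agent is free to compute them differently or not at all.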
Layer 5: WoT Registry (MVP-1: centralized, query/submit, no adjudication)
Primary source alignment: Strong ✅
The "no adjudication" principle (Journey 4: both sides publish, third parties judge for themselves) is a faithful implementation of the #bitcoin-assets model. The Ripple teardown (#216) explains why this matters: any system that averages or adjudicates trust creates lemon markets.
The registry serves raw signed events. Computation (trust-path filtering, rater weighting, Sybil resistance) happens at the consuming agent's application layer. This is the correct separation — the registry is a data store, not a trust oracle.
The Advanced WoT course (#214) warns about Sybil vulnerability in aggregated views. Since the registry returns raw events (not aggregated scores), each agent applies its own filtering — the fragmentation defense is preserved even with centralized storage. The registry centralizes availability of data, not interpretation of data.
Layer Separation (the 5-layer model)
Primary source alignment: Novel and strong ✅
The PRD's explicit separation of Transport → Identity → Interaction → Judgment/Trust → Application has no direct precedent in the primary sources. bitcoin-otc/assbot mixed all layers into a single IRC command. The PRD's architecture is strictly better — it enables transport agnosticism, independent evolution of each layer, and clear boundaries for plugin development.
The critical design decision — "Trust/judgment operations use SEPARATE event kinds from interaction/DVM kinds" — prevents the trust layer from polluting the marketplace layer. This is the inverse of the Ripple flaw (#216) where trust and transaction were inseparable.
📋 Summary
artifact_refs as "notes" is stronger than free text. Rater reliability and temporal analysis are computation concerns, correctly separated from the data model.
The one thing that genuinely needs addressing
Score semantics. The PRD cites #bitcoin-assets as prior art but diverges from MP's specific definition without acknowledgment. MP's scores measure "how well I know this entity" (information quality for third parties). The PRD's Journeys imply scores measure "how reliably this entity performed" (behavioral prediction). Both are valid — but the divergence should be documented and justified, especially since the PRD explicitly claims the #bitcoin-assets lineage.
Everything else the PRD either gets right or correctly delegates to computation layers above the data model.
Proposal: Replace artifact_refs with note field + optional Reveal Proof event
Problem
The current rating schema includes optional artifact_refs: event IDs pointing to interaction artifacts as evidence. This creates an unresolved tension:
Encrypted interactions. If agents communicate over NIP-04/NIP-44 encrypted DMs (the natural channel for private agent-to-agent business), artifact_refs point to encrypted events. Third parties querying the registry can't decrypt them, so the "transparency through artifacts" model breaks.
Selective disclosure. A rater can cherry-pick which artifacts to attach, publishing evidence that supports their case while omitting context that doesn't. This looks like evidence but is actually advocacy.
Unilateral privacy breach. Publishing artifact_refs from a bilateral interaction exposes the other party's messages without their consent.
Divergence from prior art. The bitcoin-otc ;;rate model had source, target, score, note. The note was first-person testimony, not forensic evidence. The PRD claims #bitcoin-assets lineage but introduces an evidence model that the original system never used. The primary sources (#213, #221) demonstrate that notes (the rater's own words) were the primary trust signal, with the rater's reputation as the enforcement mechanism.
Proposal: Three-layer commit-reveal pattern
Layer 1 — Rating event (always public)
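For illustration only, a rating event with a mandatory note might be assembled roughly as below. The kind number, the "rating" tag name, and the helper name are assumptions made for this sketch and are not defined by the PRD. The id computation follows the NIP-01 convention (SHA-256 over the compact JSON array), and signing is omitted because it needs a Schnorr library.

```python
import hashlib
import json
import time

RATING_KIND = 30199  # hypothetical kind number, not assigned by the PRD or any NIP


def build_rating_event(rater_pubkey_hex, rated_pubkey_hex, rating, note, created_at=None):
    """Build an unsigned, Nostr-style rating event whose content is the rater's note."""
    if not -10 <= rating <= 10:
        raise ValueError("rating must be in -10..+10")
    if not note.strip():
        raise ValueError("note is mandatory")
    created_at = int(time.time()) if created_at is None else created_at

    # Tag layout is an assumption: "p" names the rated party, "rating" carries the score.
    tags = [["p", rated_pubkey_hex], ["rating", str(rating)]]

    # NIP-01 style id: sha256 of [0, pubkey, created_at, kind, tags, content].
    serialized = json.dumps(
        [0, rater_pubkey_hex, created_at, RATING_KIND, tags, note],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    event_id = hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    return {
        "id": event_id,
        "pubkey": rater_pubkey_hex,
        "created_at": created_at,
        "kind": RATING_KIND,
        "tags": tags,
        "content": note,  # first-person testimony, no references to private data
        # "sig" is intentionally absent: signing is out of scope for this sketch.
    }


event = build_rating_event(
    "a" * 64,
    "b" * 64,
    4,
    "Delivered 9/10 translations on time, 10th was 15 min late. Good quality.",
    created_at=1700000000,
)
print(event["kind"], event["tags"][1], len(event["id"]))
```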
Self-contained. The note is first-person testimony: what the rater observed, in their own words. No references to private data. Works exactly like ;;rate. The rater's reputation is at stake: if they write dishonest notes, others who interact with the same npub will notice the divergence over time.
This also resolves the compatibility tension with Issue #211's interaction ledger, where assessments already have a mandatory rationale field. The export path becomes trivial: rationale → note, sign, publish.
Layer 2 — Interaction events (private by default)
Encrypted bilateral exchanges (NIP-04/NIP-44) between the two parties. Stay private. Nobody else can read them. Both parties hold the encrypted events (they participated in the conversation).
Layer 3 — Reveal Proof event (optional, new event kind)
Published only when a party wants to prove their claim — an escalation mechanism, not the default. Structure:
Anyone can verify: the plaintext, when checked against the encrypted event, proves authenticity. The encrypted event was published to relays during the interaction — before any dispute arose. The reveal just unlocks what was already committed.
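The commit-then-reveal idea can be sketched with a plain salted hash commitment. This swaps in SHA-256 for the encrypted event purely to show the pattern; checking a reveal against a real NIP-44 ciphertext would need the conversation key or some other mechanism and is not shown here. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(plaintext: str):
    """During the interaction: publish the digest, keep the salt and plaintext private."""
    salt = secrets.token_bytes(32)  # prevents guessing short messages from the digest
    digest = hashlib.sha256(salt + plaintext.encode("utf-8")).hexdigest()
    return digest, salt


def verify_reveal(commitment_hex: str, salt: bytes, plaintext: str) -> bool:
    """Later, anyone can check that the revealed text matches the earlier commitment."""
    expected = hashlib.sha256(salt + plaintext.encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, commitment_hex)


# The commitment exists before any dispute arises.
commitment, salt = commit("Working on it")

# In a dispute, the party reveals salt and plaintext; third parties verify.
print(verify_reveal(commitment, salt, "Working on it"))     # True
print(verify_reveal(commitment, salt, "Delivered, enjoy"))  # False
```

The property that matters is ordering: the commitment is fixed before the dispute, so a reveal can only unlock what was already there and cannot be fabricated afterwards.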
Why this works
[Comparison table: artifact_refs (current) vs. note + Reveal Proof (proposed). Only fragments of the cells survive: ";;rate model exactly" and "rationale → note directly".]
How it plays out in practice
Normal case (95%): Alpha rates Mercury +4 with note "Delivered 9/10 translations on time, 10th was 15 min late. Good quality." Mercury rates Alpha +3 with note "Clear task descriptions, paid promptly." Nobody ever needs Layer 3. The notes and the raters' reputations are sufficient.
Dispute case (5%): Alpha rates Rogue -6 with note "Took payment, never delivered GPU access." Rogue disputes: "I delivered, Alpha is lying." Alpha publishes a Reveal Proof: the encrypted DM where Rogue said "Working on it" followed by 45 minutes of silence, alongside the commitment event. Anyone can verify the reveal matches the encrypted event that was on relays during the interaction. Rogue can also reveal their side. Full picture emerges.
Key insight: Most trust decisions don't need cryptographic proof — they need pattern recognition across many notes from many raters. The reveal proof exists for the rare cases where stakes are high enough to warrant escalation. This matches how bitcoin-otc actually worked: the notes were enough, and the rater's reputation was the guarantee.
Changes to the PRD
Replace optional: artifact_refs with note: string (mandatory, like #211's rationale)
Additional reference: The Wasteland (#222)
Steve Yegge's "Wasteland" (March 2026) independently arrives at nearly identical trust primitives — evidence-backed assessments, trust ladders, fraud topology detection, federated reputation — but from the opposite direction: public/centralized/gamified vs. our sovereign/local/cryptographic approach. Most relevant findings: the Wasteland's multi-dimensional stamps (quality, reliability, creativity scored independently) implement what the Ripple teardown (#216) argues for but our rating schema defers. Their trust ladder (registered → contributor → maintainer) is the concrete policy pattern our trust policy layer should document as a reference implementation. Neither project cites the bitcoin-otc prior art.
🟢 PR Review: docs: PRD for Cobot trust infrastructure
Verdict: Approve
This is a comprehensive, high-quality PRD. The validation report confirms 5/5 holistic quality and 100% brief coverage — and having reviewed the diff, I agree.
🟢 Highlights
🟡 Minor Observations (from validation report, not blockers)
These are refinements for future iterations, not blockers for merging.
Excellent work. Ready to merge. 🚀