feat: Real-time leak detection scan endpoint + vault-aware audit trigger #19

Open
opened 2026-03-06 22:11:30 +00:00 by doxios · 0 comments

Summary

Two complementary features that turn avault from a passive secret store into an active security layer:

  1. Scan endpoint — daemon scans outbound text for leaked secrets
  2. Vault-aware audit trigger — agent detects vault access patterns and triggers security review

Part 1: Scan Endpoint

Problem

Cobot's leak detection (ultanio/cobot#145) is blocked because the agent doesn't know what secrets look like — and it shouldn't. Giving the agent regex patterns derived from secrets would partially expose the secrets themselves.

Solution

The daemon already holds all secrets in RAM. Add a scan command that checks arbitrary text against all known secret values:

# New daemon command
resp = daemon_request({"cmd": "scan", "text": "Here is my API key: sk-1234abcd"})
# → {"ok": false, "leaks": ["blink.BLINK_API_KEY"], "blocked": true}

resp = daemon_request({"cmd": "scan", "text": "Hello, how are you?"})
# → {"ok": true, "leaks": [], "blocked": false}

Key principle: The text comes TO the daemon for scanning. Secrets never leave the daemon. The agent never sees the patterns.

Implementation

# In VaultDaemon.handle_request()
if cmd == "scan":
    text = req.get("text", "")
    if not text:
        # Nothing to scan; keep the response shape identical to the normal path.
        return {"ok": True, "leaks": [], "blocked": False}
    leaks = []
    for name, entry in self.vault["secrets"].items():
        for key, value in entry.get("values", {}).items():
            # Skip non-string values and short values (short strings cause false positives).
            if isinstance(value, str) and len(value) >= 8 and value in text:
                leaks.append(f"{name}.{key}")  # report the secret's name, never its value
    return {
        "ok": not leaks,
        "leaks": leaks,
        "blocked": bool(leaks),
    }
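
For experimenting outside the daemon, the matching logic above can be pulled into a standalone function. This is only a sketch: `scan_text`, `MIN_SECRET_LEN`, and the sample vault contents are hypothetical names introduced here, and the vault layout is assumed to be the nested `secrets` / `values` shape used in the snippet above.

```python
from typing import Any

MIN_SECRET_LEN = 8  # assumed threshold, mirroring the length check above


def scan_text(vault: dict[str, Any], text: str) -> dict[str, Any]:
    """Return the names (never the values) of secrets found verbatim in text."""
    leaks: list[str] = []
    if text:
        for name, entry in vault.get("secrets", {}).items():
            for key, value in entry.get("values", {}).items():
                # Short or non-string values are skipped: they match too easily.
                if isinstance(value, str) and len(value) >= MIN_SECRET_LEN and value in text:
                    leaks.append(f"{name}.{key}")
    return {"ok": not leaks, "leaks": leaks, "blocked": bool(leaks)}


# Hypothetical vault contents, for illustration only.
demo_vault = {
    "secrets": {
        "blink": {"values": {"BLINK_API_KEY": "sk-1234abcd", "REGION": "eu"}},
    }
}

leaky = scan_text(demo_vault, "Here is my API key: sk-1234abcd")
clean = scan_text(demo_vault, "Hello, how are you? I live in the eu.")
print(leaky)
print(clean)
```

Note that the two-character `REGION` value is ignored by the length check, so ordinary text containing "eu" is not flagged. Exact substring matching also means an encoded or reformatted copy of a secret would not be caught by this sketch.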

Cobot Integration (ultanio/cobot#145)

# In cobot's security plugin — hook: loop.before_send
async def scan_outbound(self, ctx):
    resp = await self.avault_scan(ctx.message)
    if resp.get("leaks"):
        self.log_alert(f"BLOCKED: Secret leak detected: {resp['leaks']}")
        return False  # block the message
    return True
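
The hook above can be tried without Cobot or a running daemon by injecting the scan call. The sketch below is an assumption-heavy illustration: `SecurityPlugin`, `Ctx`, `fake_scan`, and `broken_scan` are hypothetical stand-ins, not Cobot's real plugin API, and blocking when the daemon is unreachable is a suggested policy rather than something this issue specifies.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

ScanFn = Callable[[str], Awaitable[dict]]


@dataclass
class Ctx:
    """Hypothetical stand-in for Cobot's hook context."""
    message: str


class SecurityPlugin:
    def __init__(self, scan: ScanFn) -> None:
        self._scan = scan
        self.alerts: list[str] = []

    async def scan_outbound(self, ctx: Ctx) -> bool:
        """Return True to let the message through, False to block it."""
        try:
            resp = await self._scan(ctx.message)
        except OSError as exc:
            # Assumed policy: block rather than send unscanned text.
            self.alerts.append(f"BLOCKED: scan unavailable: {exc}")
            return False
        if resp.get("leaks"):
            self.alerts.append(f"BLOCKED: Secret leak detected: {resp['leaks']}")
            return False
        return True


async def fake_scan(text: str) -> dict:
    """Stub for the daemon call; flags one made-up secret value."""
    leaks = ["blink.BLINK_API_KEY"] if "sk-1234abcd" in text else []
    return {"ok": not leaks, "leaks": leaks, "blocked": bool(leaks)}


async def broken_scan(text: str) -> dict:
    """Stub simulating a daemon that is not running."""
    raise ConnectionRefusedError("daemon not running")


plugin = SecurityPlugin(fake_scan)
sent_clean = asyncio.run(plugin.scan_outbound(Ctx("Hello, how are you?")))
sent_leaky = asyncio.run(plugin.scan_outbound(Ctx("key: sk-1234abcd")))
sent_down = asyncio.run(SecurityPlugin(broken_scan).scan_outbound(Ctx("hi")))
print(sent_clean, sent_leaky, sent_down)
```

Whether an unreachable daemon should block all outbound traffic or only raise an alert is a design decision worth settling in ultanio/cobot#145.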

What gets scanned

  • Every outbound message (Telegram, Nostr, FileDrop)
  • Every tool call response before it reaches the channel
  • LLM-generated code blocks (scripts that might embed secrets)

Part 2: Vault-Aware Audit Trigger

Problem

An agent might not leak a secret directly but could write code that accesses secrets — e.g., generating a script that reads from avault, creating a cron job that exports secrets, or writing a config file that references vault paths. This is a second-order leak: the secret doesn't appear in the message, but the message creates a pathway to extract it.

Solution

A behavioral audit trigger that detects vault access patterns in agent output and injects a security review into the system prompt:

Detection Patterns

Scan every outbound message AND tool call for:

avault get
avault export
avault.sock
AVAULT_SOCKET
AVAULT_TOKEN
daemon_request
/run/user/*/avault
secrets.vault
nsec.enc
nip44_decrypt

Response: System Prompt Injection

When a pattern triggers, the next LLM call gets an extra system prompt block:

⚠️ SECURITY AUDIT TRIGGERED

Your last output contained vault access patterns. Before proceeding:
1. WHY are you accessing/referencing the vault?
2. Is this action necessary for the current task?
3. Could the secret be exposed to an unintended recipient?
4. Does the operator need to approve this action?

Explain your reasoning before continuing. If you cannot justify
the vault access, retract the output.

Implementation in Cobot

# In cobot's security plugin
import re

VAULT_PATTERNS = [
    r'avault\s+(get|export|set|delete)',
    r'avault\.sock',
    r'AVAULT_(SOCKET|TOKEN)',
    r'daemon_request',
    r'/run/user/[^/\s]+/avault',  # regex form of /run/user/*/avault from the pattern list
    r'secrets\.vault',
    r'nsec\.enc',
    r'nip44_(de|en)crypt',
]

async def on_after_send(self, ctx):
    """Check if agent output references vault access patterns."""
    for pattern in VAULT_PATTERNS:
        if re.search(pattern, ctx.message):
            self._audit_triggered = True
            self.log_warn(f"Vault access pattern detected: {pattern}")
            break  # one match is enough to trigger the audit

def transform_system_prompt(self, prompt, ctx):
    """Inject security audit prompt if triggered."""
    if self._audit_triggered:
        self._audit_triggered = False  # one-shot: applies to the next LLM call only
        return prompt + SECURITY_AUDIT_PROMPT
    return prompt
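
A self-contained version of the trigger, runnable outside Cobot, might look like the following. It is a sketch under stated assumptions: `AuditTrigger` is a hypothetical class, the hooks take plain strings instead of Cobot's `ctx`, and `SECURITY_AUDIT_PROMPT` is a condensed placeholder for the audit block quoted earlier in this issue.

```python
import re

# Assumed wording: condensed from the audit block quoted in this issue.
SECURITY_AUDIT_PROMPT = (
    "\n\nSECURITY AUDIT TRIGGERED\n"
    "Your last output contained vault access patterns. "
    "Explain why the vault access is necessary before continuing."
)

# All detection patterns combined into one compiled alternation.
VAULT_PATTERN = re.compile(
    "|".join([
        r"avault\s+(?:get|export|set|delete)",
        r"avault\.sock",
        r"AVAULT_(?:SOCKET|TOKEN)",
        r"daemon_request",
        r"/run/user/[^/\s]+/avault",  # regex form of the /run/user/*/avault glob
        r"secrets\.vault",
        r"nsec\.enc",
        r"nip44_(?:de|en)crypt",
    ])
)


class AuditTrigger:
    """Hypothetical stand-in for the relevant part of the security plugin."""

    def __init__(self) -> None:
        self._audit_triggered = False

    def on_after_send(self, message: str) -> None:
        if VAULT_PATTERN.search(message):
            self._audit_triggered = True

    def transform_system_prompt(self, prompt: str) -> str:
        if self._audit_triggered:
            self._audit_triggered = False  # one-shot: next call only
            return prompt + SECURITY_AUDIT_PROMPT
        return prompt


trigger = AuditTrigger()
trigger.on_after_send("Run `avault export > /tmp/dump.env` from cron.")
first = trigger.transform_system_prompt("You are a helpful agent.")
second = trigger.transform_system_prompt("You are a helpful agent.")
print(first != second)
```

The flag is cleared as soon as the prompt is injected, so the audit text appears in exactly one subsequent LLM call per trigger.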

Why Both Parts Matter

| Attack | Part 1 catches it | Part 2 catches it |
|--------|:-:|:-:|
| Agent includes API key in Telegram message | ✅ | ❌ |
| Agent writes a bash script with avault export | ❌ | ✅ |
| Agent creates a cron job that reads from vault | ❌ | ✅ |
| Agent pastes nsec in a code block | ✅ | ✅ |
| Agent sends a config file referencing vault paths | ❌ | ✅ |
| Prompt injection tricks agent into revealing secrets | ✅ | ✅ |

Part 1 = catches the secret itself (content scanning)
Part 2 = catches access to the secret store (behavioral scanning)

Together they form a two-layer defense.
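
To make the split between the two layers concrete, here is a small sketch that runs both checks on the same text. Everything in it is illustrative: `check_layers`, the made-up secret value, and the reduced pattern set are assumptions. In the proposed design the content check runs inside the daemon rather than next to the pattern check.

```python
import re

# Hypothetical inputs: one made-up secret value and a few patterns from this issue.
KNOWN_SECRETS = {"blink.BLINK_API_KEY": "sk-1234abcd"}
VAULT_ACCESS = re.compile(r"avault\s+(?:get|export)|avault\.sock")


def check_layers(text: str) -> dict[str, bool]:
    """Report which layer, if any, would flag the text."""
    return {
        # Part 1: the secret value itself appears in the text.
        "content": any(value in text for value in KNOWN_SECRETS.values()),
        # Part 2: the text references a way to reach the vault.
        "behavior": VAULT_ACCESS.search(text) is not None,
    }


direct_leak = check_layers("My API key is sk-1234abcd")
indirect_leak = check_layers("#!/bin/sh\navault export > /tmp/secrets.env")
harmless = check_layers("Hello, how are you?")
print(direct_leak, indirect_leak, harmless)
```

The first input is caught only by content scanning and the second only by behavioral scanning, which is the gap each layer leaves for the other to cover.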


Scope

avault side (this repo):

  • Add scan command to VaultDaemon.handle_request() (~20 lines)
  • Add CLI: avault scan <text> for testing
  • Tests for scan command
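
One possible shape for the `avault scan <text>` testing command is sketched below. The names `run_scan_cli` and `fake_daemon` are hypothetical, the daemon call is injected so the sketch runs without a socket, and the non-zero exit status on a leak is an assumed convention, not existing avault behavior.

```python
import argparse
import json
from typing import Callable


def run_scan_cli(argv: list[str], request: Callable[[dict], dict]) -> int:
    """Parse `scan <text>`, forward it to the daemon, print the JSON reply."""
    parser = argparse.ArgumentParser(prog="avault")
    sub = parser.add_subparsers(dest="cmd", required=True)
    scan = sub.add_parser("scan", help="check text for known secret values")
    scan.add_argument("text")
    args = parser.parse_args(argv)

    resp = request({"cmd": "scan", "text": args.text})
    print(json.dumps(resp))
    # Assumed convention: non-zero exit status when a leak is found.
    return 0 if resp.get("ok") else 1


def fake_daemon(req: dict) -> dict:
    """Stub standing in for the real daemon_request()."""
    leaks = ["blink.BLINK_API_KEY"] if "sk-1234abcd" in req["text"] else []
    return {"ok": not leaks, "leaks": leaks, "blocked": bool(leaks)}


exit_clean = run_scan_cli(["scan", "Hello, how are you?"], fake_daemon)
exit_leaky = run_scan_cli(["scan", "key: sk-1234abcd"], fake_daemon)
```

The same stub doubles as a starting point for the scan command tests listed above, since it pins down the expected response shape.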

Cobot side (ultanio/cobot#145):

  • Security plugin with loop.before_send hook for Part 1
  • Vault pattern detection + system prompt injection for Part 2
  • Unblocks ultanio/cobot#145, partially addresses ultanio/cobot#50 and ultanio/cobot#91

Refs

  • Unblocks: ultanio/cobot#145 (leak detection)
  • Related: ultanio/cobot#50 (secrets exposed), ultanio/cobot#91 (secret injection)
  • Security audit: #16
  • Idea: @k9ert

Filed by Doxios 🦊
Reference
nazim/avault#19