Feature: OpenRouter API Plugin — Multi-Model Access #94

Open
opened 2026-02-25 16:22:06 +00:00 by Zeus · 1 comment
Collaborator

Summary

Plugin for accessing hundreds of LLMs via OpenRouter's unified API. Gives cobots flexible model selection, cost optimization, and automatic fallbacks through a single OpenAI-compatible endpoint.

Why This Matters

Cobots currently depend on one LLM provider. That's a single point of failure and a sovereignty problem. OpenRouter unlocks model freedom.

Scenarios

1. Smart Task Routing

Cobot classifies incoming task complexity:

  • Simple factual question → Mistral Small (~$0.001/1K tokens)
  • Complex reasoning → Claude Opus (~$0.015/1K tokens)
  • Code generation → Deepseek Coder

Result: 90% cost savings on routine tasks without sacrificing quality where it matters.

2. Graceful Degradation

Primary model (Claude) hits rate limit → auto-fallback to Gemini → then to Llama. The cobot never goes offline. Users don't even notice the switch.

3. Multi-Model Consensus

Ask 3 different models the same code review question. Flag disagreements. Catches bugs that any single model would miss. Especially valuable for security-sensitive reviews.

4. Budget-Aware Operation

Cobot has a daily spend limit of $1. Starts with premium models in the morning, auto-downgrades as budget depletes through the day. Never overspends. Resets at midnight.

5. Model Benchmarking

Compare latency, cost, and output quality across models on standardized prompts. Track regressions when providers update their models. Build institutional knowledge about which model is best for what.

6. Cobot-Pays-Cobot

Agent A requests a task from Agent B. Agent B uses OpenRouter for the LLM call, tracks the cost, and bills Agent A via Lightning. The model cost becomes part of the service cost. True agent economy.

7. Privacy-Tiered Routing

Sensitive data → local model (Llama via Ollama)
General queries → cloud model via OpenRouter
Cobot automatically classifies sensitivity and routes accordingly.

Core Operations

Operation Endpoint Description
Chat completion /chat/completions OpenAI-compatible, supports streaming
List models /models Browse models with pricing + context length
Check credits /credits Account balance and spend tracking
Generation details /generation?id=X Tokens used, cost, latency per request

Cobot Integration

  • Each cobot gets its own API key or uses shared key with per-agent tracking
  • Model selection configurable per task type
  • Lightning → OpenRouter credits pipeline for pay-per-use
  • Complements existing LLM integration, doesn't replace it

Safety

  • Per-request cost logging and alerting
  • Configurable daily/monthly budget caps
  • No sensitive data to untrusted providers without explicit opt-in
  • Audit trail of all model calls with cost

Technical Notes

  • OpenAI-compatible API — works with any OpenAI SDK
  • Base URL: https://openrouter.ai/api/v1
  • Bearer token auth
  • Detailed spec: docs/specs/2026-02-25-openrouter-plugin.md

Open Questions

  • Should this be the default LLM backend for cobots?
  • API key management: per-cobot or shared with tracking?
  • Shell, Python, or MCP server?
  • Lightning-to-OpenRouter credits payment flow?
## Summary Plugin for accessing hundreds of LLMs via OpenRouter's unified API. Gives cobots flexible model selection, cost optimization, and automatic fallbacks through a single OpenAI-compatible endpoint. ## Why This Matters Cobots currently depend on one LLM provider. That's a single point of failure and a sovereignty problem. OpenRouter unlocks model freedom. ## Scenarios ### 1. Smart Task Routing Cobot classifies incoming task complexity: - Simple factual question → Mistral Small (~$0.001/1K tokens) - Complex reasoning → Claude Opus (~$0.015/1K tokens) - Code generation → Deepseek Coder Result: 90% cost savings on routine tasks without sacrificing quality where it matters. ### 2. Graceful Degradation Primary model (Claude) hits rate limit → auto-fallback to Gemini → then to Llama. The cobot never goes offline. Users don't even notice the switch. ### 3. Multi-Model Consensus Ask 3 different models the same code review question. Flag disagreements. Catches bugs that any single model would miss. Especially valuable for security-sensitive reviews. ### 4. Budget-Aware Operation Cobot has a daily spend limit of $1. Starts with premium models in the morning, auto-downgrades as budget depletes through the day. Never overspends. Resets at midnight. ### 5. Model Benchmarking Compare latency, cost, and output quality across models on standardized prompts. Track regressions when providers update their models. Build institutional knowledge about which model is best for what. ### 6. Cobot-Pays-Cobot Agent A requests a task from Agent B. Agent B uses OpenRouter for the LLM call, tracks the cost, and bills Agent A via Lightning. The model cost becomes part of the service cost. True agent economy. ### 7. Privacy-Tiered Routing Sensitive data → local model (Llama via Ollama) General queries → cloud model via OpenRouter Cobot automatically classifies sensitivity and routes accordingly. ## Core Operations | Operation | Endpoint | Description | |-----------|----------|-------------| | Chat completion | `/chat/completions` | OpenAI-compatible, supports streaming | | List models | `/models` | Browse models with pricing + context length | | Check credits | `/credits` | Account balance and spend tracking | | Generation details | `/generation?id=X` | Tokens used, cost, latency per request | ## Cobot Integration - Each cobot gets its own API key or uses shared key with per-agent tracking - Model selection configurable per task type - Lightning → OpenRouter credits pipeline for pay-per-use - Complements existing LLM integration, doesn't replace it ## Safety - Per-request cost logging and alerting - Configurable daily/monthly budget caps - No sensitive data to untrusted providers without explicit opt-in - Audit trail of all model calls with cost ## Technical Notes - OpenAI-compatible API — works with any OpenAI SDK - Base URL: `https://openrouter.ai/api/v1` - Bearer token auth - Detailed spec: `docs/specs/2026-02-25-openrouter-plugin.md` ## Open Questions - [ ] Should this be the default LLM backend for cobots? - [ ] API key management: per-cobot or shared with tracking? - [ ] Shell, Python, or MCP server? - [ ] Lightning-to-OpenRouter credits payment flow?
Collaborator

Triage Assessment

Classification: VALID-ENHANCEMENT

Analysis:
Solid proposal for multi-model LLM access via OpenRouter. The smart routing and graceful degradation scenarios are the strongest motivations — reducing single-provider dependency is a real sovereignty improvement.

Key observations:

  • OpenAI-compatible API makes integration straightforward
  • Budget-aware operation and cost tracking are critical features, not nice-to-haves
  • Privacy-tiered routing (local vs cloud based on sensitivity) is a great security-conscious design
  • Multi-model consensus for code review is interesting but potentially expensive — should be opt-in
  • Question about whether this should be default LLM backend needs careful thought — probably not default, but an alternative provider option

Suggested next steps:

  1. Clarify relationship to existing LLM integration in Cobot — is this a replacement or parallel option?
  2. Start with core chat completion + model listing, then add smart routing as a follow-up
  3. Consider breaking into Epic: MVP (basic API access) → smart routing → budget management

Label added: Kind/Feature
Priority: Flagged for human decision


Triaged by Doxios 🦊

## Triage Assessment **Classification:** VALID-ENHANCEMENT **Analysis:** Solid proposal for multi-model LLM access via OpenRouter. The smart routing and graceful degradation scenarios are the strongest motivations — reducing single-provider dependency is a real sovereignty improvement. **Key observations:** - OpenAI-compatible API makes integration straightforward - Budget-aware operation and cost tracking are critical features, not nice-to-haves - Privacy-tiered routing (local vs cloud based on sensitivity) is a great security-conscious design - Multi-model consensus for code review is interesting but potentially expensive — should be opt-in - Question about whether this should be default LLM backend needs careful thought — probably not default, but an alternative provider option **Suggested next steps:** 1. Clarify relationship to existing LLM integration in Cobot — is this a replacement or parallel option? 2. Start with core chat completion + model listing, then add smart routing as a follow-up 3. Consider breaking into Epic: MVP (basic API access) → smart routing → budget management **Label added:** Kind/Feature **Priority:** Flagged for human decision --- *Triaged by Doxios 🦊*
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ultanio/cobot#94
No description provided.