bug: LLM rate limit errors surface raw error to user instead of falling back #235
Description
When the primary LLM provider is rate limited (HTTP 429) or temporarily unavailable (HTTP 503), the agent surfaces the raw provider error directly to the user instead of recovering gracefully. This is a poor user experience across all channels (Telegram, Matrix, CLI, etc.) and breaks conversational flow.
Root Cause
The PPQ plugin (`cobot/plugins/ppq/plugin.py`) has no specific handling for rate-limit (429) or service-unavailable (503) responses. All HTTP errors are caught as a generic `LLMError` and bubbled up through the loop plugin, which returns a generic error message with an error reference ID. There is no retry logic, backoff, or fallback mechanism.
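For illustration, the catch-all described above likely looks something like the sketch below. This is hypothetical, not the actual plugin code; the httpx client and the `complete` function are assumptions.

```python
# Hypothetical illustration of the catch-all handling described above -- not the actual plugin code.
import httpx


class LLMError(Exception):
    """Generic error type currently raised for any provider failure."""


async def complete(client: httpx.AsyncClient, messages: list[dict]) -> str:
    try:
        response = await client.post("/chat/completions", json={"messages": messages})
        response.raise_for_status()
    except httpx.HTTPStatusError as exc:
        # 429 and 503 land here together with every other HTTP error, so the
        # loop plugin only ever sees an opaque LLMError and shows a generic message.
        raise LLMError(str(exc)) from exc
    return response.json()["choices"][0]["message"]["content"]
```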
Expected Behavior
When the primary LLM is rate limited or unavailable, the agent should automatically retry with a configurable fallback LLM provider/model so the user experiences no interruption. The fallback should be transparent to the user.
Proposed Solution
- Add a `fallback` LLM config to `cobot.yml` (see the sketch after this list).
- Default fallback model: `openai/gpt-4.1-mini` — strong reasoning, fast, and significantly cheaper than Claude Sonnet 4 (~$0.40/1M input vs ~$3/1M input).
- Implementation in the LLM provider layer (channel-agnostic):
  - Emit an `llm.fallback_triggered` extension point event for observability.
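A possible shape for the config block, as a sketch only; the key names `fallback` and `retry_on` and the primary model shown are assumptions, not an agreed schema:

```yaml
# cobot.yml -- hypothetical fallback block, key names are illustrative only
llm:
  model: anthropic/claude-sonnet-4     # primary (example)
  fallback:
    model: openai/gpt-4.1-mini         # used when the primary returns 429/503
    retry_on: [429, 503]               # status codes that trigger the fallback
```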
Affected files:
- `cobot/plugins/ppq/plugin.py` — add fallback logic and rate-limit detection (see the sketch after this list)
- `cobot/plugins/config/plugin.py` — parse the new `fallback` config block
- `cobot/plugins/interfaces.py` — possibly extend `LLMError` with an error type (rate_limit, unavailable, etc.)
- `cobot/plugins/loop/plugin.py` — update error handling if needed
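A minimal sketch of the fallback path, assuming an httpx-style client and a fallback model configured as above. The class shape, the `emit_event` hook, and the `error_type` field on `LLMError` are hypothetical names for illustration, not the actual cobot API:

```python
# Hypothetical sketch -- class, method, and event names are illustrative, not the actual cobot API.
import httpx


class LLMError(Exception):
    """Sketch of the existing exception with an added error_type ("rate_limit", "unavailable", "other")."""

    def __init__(self, message: str, error_type: str = "other") -> None:
        super().__init__(message)
        self.error_type = error_type


class PPQPlugin:
    def __init__(self, client: httpx.AsyncClient, primary_model: str, fallback_model: str | None) -> None:
        self._client = client
        self.primary_model = primary_model
        self.fallback_model = fallback_model  # e.g. "openai/gpt-4.1-mini" from cobot.yml

    async def complete(self, messages: list[dict]) -> str:
        try:
            return await self._complete(self.primary_model, messages)
        except LLMError as exc:
            # Only fall back on rate-limit / unavailable errors; re-raise everything else.
            if exc.error_type not in ("rate_limit", "unavailable") or not self.fallback_model:
                raise
            # Tell the observability plugin the fallback path was taken.
            await self.emit_event("llm.fallback_triggered", {
                "primary": self.primary_model,
                "fallback": self.fallback_model,
                "reason": exc.error_type,
            })
            return await self._complete(self.fallback_model, messages)

    async def _complete(self, model: str, messages: list[dict]) -> str:
        try:
            response = await self._client.post(
                "/chat/completions", json={"model": model, "messages": messages}
            )
            response.raise_for_status()
        except httpx.HTTPStatusError as exc:
            status = exc.response.status_code
            if status == 429:
                raise LLMError(str(exc), error_type="rate_limit") from exc
            if status == 503:
                raise LLMError(str(exc), error_type="unavailable") from exc
            raise LLMError(str(exc)) from exc
        return response.json()["choices"][0]["message"]["content"]

    async def emit_event(self, name: str, payload: dict) -> None:
        # Placeholder for the extension-point dispatch; the real mechanism lives in the plugin system.
        ...
```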
Acceptance Criteria
- Fallback provider/model is configurable in `cobot.yml`
- Defaults to `openai/gpt-4.1-mini` when no fallback is configured
- `llm.fallback_triggered` event emitted for the observability plugin