Error Handling Strategies
Production agents encounter failures constantly: API rate limits, network timeouts, malformed LLM output, and tool execution errors. Robust error handling is what separates a demo from a production-ready agent.
Error Classification
Not all errors are equal. CoFounder classifies errors into categories so you can handle each appropriately:
- Retryable -- Rate limits (429), temporary network failures, server errors (500/503). These should be retried with backoff.
- Recoverable -- Tool validation errors, malformed LLM output. The agent can self-correct on the next step.
- Fatal -- Authentication failures (401), missing permissions, invalid configuration. These require human intervention.
import { createAgent, ErrorCategory } from '@waymakerai/aicofounder-core';

const agent = createAgent({
  name: 'resilient-agent',
  model: 'gpt-4o',
  errorHandling: {
    classify: (error) => {
      if (error.status === 429) return ErrorCategory.RETRYABLE;
      if (error.status === 401) return ErrorCategory.FATAL;
      if (error.message?.includes('invalid JSON')) return ErrorCategory.RECOVERABLE;
      return ErrorCategory.RETRYABLE; // Default to retryable
    },
  },
  tools: [searchTool, databaseTool],
});

Retry Logic with Backoff
CoFounder provides built-in retry logic with exponential backoff. Configure it per-agent or per-tool:
import { createAgent } from '@waymakerai/aicofounder-core';

const agent = createAgent({
  name: 'retrying-agent',
  model: 'gpt-4o',
  errorHandling: {
    retry: {
      maxRetries: 3,
      initialDelayMs: 1000,
      maxDelayMs: 30000,
      backoffMultiplier: 2,
      retryableStatuses: [429, 500, 502, 503],
    },
    onRetry: (error, attempt) => {
      console.warn(`Retry attempt ${attempt}: ${error.message}`);
    },
  },
  tools: [searchTool],
});

Fallback Models
When a primary model is unavailable or returns errors, CoFounder can automatically switch to a fallback model:
import { createAgent } from '@waymakerai/aicofounder-core';

const agent = createAgent({
  name: 'fallback-agent',
  model: 'gpt-4o',
  // Tried in order when the primary model fails
  fallbackModels: [
    { model: 'claude-sonnet-4-20250514', provider: 'anthropic' },
    { model: 'gpt-4o-mini', provider: 'openai' },
  ],
  errorHandling: {
    useFallbackOn: [429, 500, 503],
    onFallback: (fromModel, toModel, error) => {
      console.warn(`Falling back from ${fromModel} to ${toModel}: ${error.message}`);
    },
  },
});

Fallback models are tried in order. If all models fail, the agent raises the final error. Use cheaper or more available models as fallbacks.
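To make the "tried in order" behavior concrete, here is a minimal standalone sketch of an ordered fallback loop. It does not use the CoFounder API: `ModelCandidate`, `HttpError`, and `callWithFallback` are hypothetical names introduced for illustration, and the library's internal behavior may differ.

```typescript
// Hypothetical types for illustration only; not part of the CoFounder API.
interface ModelCandidate {
  model: string;
  call: (prompt: string) => Promise<string>;
}

class HttpError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.status = status;
  }
}

// Try each candidate in order. Move to the next one only when the error
// status is in fallbackStatuses; rethrow anything else immediately.
async function callWithFallback(
  candidates: ModelCandidate[],
  prompt: string,
  fallbackStatuses: number[] = [429, 500, 503],
): Promise<{ model: string; output: string }> {
  if (candidates.length === 0) {
    throw new Error('No models configured');
  }
  let lastError: unknown;
  for (const candidate of candidates) {
    try {
      const output = await candidate.call(prompt);
      return { model: candidate.model, output };
    } catch (error) {
      lastError = error;
      const status = error instanceof HttpError ? error.status : undefined;
      // Errors outside the fallback list (for example 401) are not retried elsewhere.
      if (status === undefined || !fallbackStatuses.includes(status)) {
        throw error;
      }
    }
  }
  // Every candidate failed with a fallback-eligible status: surface the final error.
  throw lastError;
}
```

In this sketch a 401 from the primary model stops the chain at once, which matches the idea that authentication failures need human intervention rather than a different model.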
Graceful Degradation
Sometimes the best response to an error is a partial result rather than a complete failure. CoFounder's hooks let you implement graceful degradation:
import { createAgent } from '@waymakerai/aicofounder-core';

const agent = createAgent({
  name: 'graceful-agent',
  model: 'gpt-4o',
  hooks: {
    onToolError: async (toolName, error, context) => {
      // If search fails, continue with what we have
      if (toolName === 'web_search') {
        return {
          handled: true,
          result: JSON.stringify({
            partial: true,
            message: 'Web search is temporarily unavailable. Answering with available knowledge.',
          }),
        };
      }
      return { handled: false }; // Let other errors propagate
    },
    onStepError: async (error, step, context) => {
      if (step >= context.maxSteps - 1) {
        // On last step, return whatever we have
        return {
          handled: true,
          output: 'I was unable to complete the full analysis, but here is what I found so far: ' +
            context.partialResults.join('\n'),
        };
      }
      return { handled: false };
    },
  },
  tools: [searchTool, databaseTool],
});

User-Friendly Error Messages
End users should never see raw stack traces or technical error codes. Map internal errors to helpful messages:
- Rate limit errors: "I'm processing many requests right now. Please try again in a moment."
- Tool failures: "I wasn't able to access that information, but I can try a different approach."
- Context overflow: "Our conversation has gotten long. Let me summarize what we've discussed and we can continue."
- Model errors: "I encountered an issue generating a response. Let me try again."
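The mapping above can be sketched as a small lookup function. This is an illustrative example, not CoFounder's API: the `ErrorKind` names, the `classifyForUser` heuristics (including the `'context length'` substring check), and the generic fallback message are all assumptions introduced here.

```typescript
// Hypothetical error kinds for illustration; CoFounder's own error types may differ.
type ErrorKind = 'rate_limit' | 'tool_failure' | 'context_overflow' | 'model_error';

interface InternalError {
  status?: number;
  message?: string;
  toolName?: string; // Assumed to be set when a tool call raised the error
}

const USER_MESSAGES: Record<ErrorKind, string> = {
  rate_limit: "I'm processing many requests right now. Please try again in a moment.",
  tool_failure: "I wasn't able to access that information, but I can try a different approach.",
  context_overflow:
    "Our conversation has gotten long. Let me summarize what we've discussed and we can continue.",
  model_error: 'I encountered an issue generating a response. Let me try again.',
};

// Assumed placeholder for errors that match none of the kinds above.
const GENERIC_MESSAGE = 'Something went wrong on my end. Please try again.';

// Heuristic classification; the checks are assumptions, adjust them to your error shapes.
function classifyForUser(error: InternalError): ErrorKind | undefined {
  if (error.status === 429) return 'rate_limit';
  if (error.toolName) return 'tool_failure';
  if (error.message?.toLowerCase().includes('context length')) return 'context_overflow';
  if (error.status !== undefined && error.status >= 500) return 'model_error';
  return undefined;
}

// Never return the raw error text to the end user.
function toUserMessage(error: InternalError): string {
  const kind = classifyForUser(error);
  return kind ? USER_MESSAGES[kind] : GENERIC_MESSAGE;
}
```

Log the original error (status, message, stack) server-side before calling `toUserMessage`, so the technical detail stays available for debugging without reaching the user.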