Model routing
Adaptive, cost-aware routing: try a cheap model first and escalate to a stronger one only when the result isn't good enough. You pay for the big model only when you need it.
A model string
The simplest model is a 'provider/model' string. The prefix selects the provider package, imported lazily.
model: 'anthropic/claude-opus-4-8'
model: 'openai/gpt-4o'Tiered routing
Wrap a list of tiers with tieredModel. Each tier has an accept predicate that decides whether its result is good enough; if not, Redtuma escalates to the next tier. The last tier is always accepted.
routing.ts
import { Agent } from '@redtuma/core/agent'
import { tieredModel } from '@redtuma/core'
const agent = new Agent({
id: 'assistant',
instructions: 'Answer the question.',
model: tieredModel({
tiers: [
{
model: 'anthropic/claude-haiku-4-5',
accept: (r) => r.text.length > 0 && !r.text.includes('I cannot'),
},
{ model: 'anthropic/claude-opus-4-8' }, // fallback, always accepted
],
onEscalate: ({ from, to }) => console.log(`escalated ${from} -> ${to}`),
}),
})Inspecting the decision
When a tiered policy chose the result, generate reports which tier won and how many attempts it took.
const res = await agent.generate('Summarize this.')
res.routing // { tier: 0, attempts: 1 } — the cheap tier was enough
// { tier: 1, attempts: 2 } — escalated to the strong modelGood accept predicates
- Reject empty output or refusals.
- Reject when structured output fails to parse / validate.
- Gate on a confidence score your tool returns.
- Gate on length, format, or required keywords.
Streaming
Because a stream can't be judged before it commits,
stream uses the cheapest tier. Adaptive escalation applies to generate.