`anthropic/claude-sonnet-4-6` or `openai/gpt-4.1`.
## Scan First
- `models:` is a fallback chain, not a single preferred value.
- Fallback happens on every live model call.
- `thinking:` is `off | low | medium | high`.
- Jobs inherit `thinking`, `context_ratio`, and `max_output_tokens` from the resolved agent.
- Jobs may override `model` and `max_iterations`, but not `thinking`.
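The facts above can be illustrated with a hypothetical agent config. The file layout and key nesting here are illustrative, not the tool's exact schema; only the field names come from this document.

```yaml
# Hypothetical agent defaults; layout is illustrative.
defaults:
  models:
    - anthropic/claude-sonnet-4-6   # tried first
    - openai/gpt-4.1                # fallback on error
  thinking: medium                  # off | low | medium | high
  context_ratio: 0.6
  max_output_tokens: null           # null = ask LiteLLM for the model's max
```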
## Fallback Chains
- Operator tries the first model.
- If the call errors, it retries the same request with the next model.
- If a later model succeeds, the run continues normally.
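The steps above can be sketched as a simple loop. `call_model` is a stand-in for the real LiteLLM call, not an actual Operator API:

```python
# Minimal fallback-chain sketch: try each model in order,
# re-sending the same request until one call succeeds.

def call_with_fallback(models, request, call_model):
    last_error = None
    for model in models:
        try:
            # Same request, different model on each attempt.
            return model, call_model(model, request)
        except Exception as err:
            last_error = err  # remember the failure, try the next model
    raise RuntimeError(f"all models failed: {last_error}")
```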
## Thinking Levels
Valid levels are `off`, `low`, `medium`, and `high`.

| Operator | LiteLLM |
|---|---|
| `off` | no reasoning request; usually `reasoning_effort="none"`, but omitted for Anthropic compatibility |
| `low` | `reasoning_effort="low"` |
| `medium` | `reasoning_effort="medium"` |
| `high` | `reasoning_effort="high"` |
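The table maps to LiteLLM kwargs roughly as follows. This is a sketch: the provider check for the Anthropic special case is an assumption about how the omission might be implemented, not Operator's actual code.

```python
# Map an Operator thinking level to LiteLLM kwargs, per the table above.

def thinking_kwargs(level, model):
    if level == "off":
        # Hypothetical provider check: omit the param for Anthropic
        # models, send reasoning_effort="none" otherwise.
        if model.startswith("anthropic/"):
            return {}
        return {"reasoning_effort": "none"}
    if level in ("low", "medium", "high"):
        return {"reasoning_effort": level}
    raise ValueError(f"unknown thinking level: {level}")
```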
## Cross-Provider Safety
When a response includes provider-specific reasoning metadata, Operator strips that metadata from assistant history before the next model call. This is what keeps Anthropic-to-OpenAI or OpenAI-to-other-provider fallbacks from breaking on incompatible history payloads.

## Inheritance Rules
| Execution type | Models | Thinking | Other execution settings |
|---|---|---|---|
| Chat | agent override or `defaults.models` | agent override or `defaults.thinking` | `max_iterations`, `context_ratio`, `max_output_tokens` resolve the same way |
| `spawn_agent()` without `agent=` | inherit current run context | inherit current run context | inherit current run context |
| `spawn_agent(agent="other")` | switch to target agent | switch to target agent | switch to target agent |
| Scheduled job | `job.model` or resolved agent models | resolved agent thinking | resolved agent `context_ratio` and `max_output_tokens` |
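The table's resolution order can be sketched with plain dicts. The dict-based shape and function names are stand-ins for illustration, not the real config objects:

```python
# Sketch of the inheritance rules: each setting resolves from the
# agent override when present, else from defaults.

def resolve_settings(defaults, agent_override=None):
    resolved = dict(defaults)
    resolved.update(agent_override or {})
    return resolved

def resolve_job(job, agent_settings):
    # Jobs may override model/max_iterations, but inherit thinking,
    # context_ratio, and max_output_tokens from the resolved agent.
    resolved = dict(agent_settings)
    for key in ("model", "max_iterations"):
        if job.get(key) is not None:
            resolved[key] = job[key]
    return resolved
```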
## Jobs
Job frontmatter supports `model` and `max_iterations`. `thinking`, `context_ratio`, and `max_output_tokens` are inherited from the resolved agent.
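A hypothetical job frontmatter, for illustration only; the field names come from this document, the surrounding file format is assumed:

```yaml
---
model: openai/gpt-4.1   # overrides the agent's model chain for this job
max_iterations: 10
---
```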
## `max_output_tokens`
`max_output_tokens` sets LiteLLM `max_tokens` for each call.

- if configured: Operator uses your explicit value
- if `null`: Operator asks LiteLLM for the model's max output size and uses that when available
- if LiteLLM cannot resolve a max: the param is omitted
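The three cases above amount to a small resolution function. `lookup_model_max` is a stand-in for asking LiteLLM for the model's maximum output size; it returns `None` when LiteLLM cannot resolve one:

```python
# Three-way max_output_tokens resolution, per the bullets above.

def resolve_max_tokens(configured, model, lookup_model_max):
    if configured is not None:
        return {"max_tokens": configured}      # explicit value wins
    model_max = lookup_model_max(model)
    if model_max is not None:
        return {"max_tokens": model_max}       # use LiteLLM's answer
    return {}                                  # unknown: omit the param
```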
## `context_ratio`
`context_ratio` controls how aggressively Operator trims stored history before each model call.

- higher values keep more history
- lower values reduce token usage
- `0.0` disables trimming logic and sends the full stored history