Modes
- ON: the model keeps its chain-of-thought; suited to multi-step reasoning, tool use, and math.
- OFF: concise replies without chain-of-thought; better for chat and fan-out agents.
Reasoning control
Control chain-of-thought depth with Reasoning ON/OFF and thinking-token budgets to balance accuracy, privacy, and cost.
To set a budget, include a maximum thinking-token limit in the prompt or in the request payload.
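As a rough illustration of passing the limit in a request payload, the sketch below builds a plain dictionary. The function name `build_request` and the keys `reasoning` and `max_thinking_tokens` are hypothetical placeholders, not a documented API; check the serving endpoint's reference for the real field names.

```python
import json
from typing import Any, Dict, Optional


def build_request(prompt: str, reasoning_on: bool,
                  max_thinking_tokens: Optional[int] = None) -> Dict[str, Any]:
    """Build a request payload with a reasoning toggle and an optional budget.

    The keys "reasoning" and "max_thinking_tokens" are hypothetical
    placeholders used for illustration only.
    """
    if max_thinking_tokens is not None and max_thinking_tokens <= 0:
        raise ValueError("max_thinking_tokens must be positive")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "reasoning": "on" if reasoning_on else "off",
    }
    # Only attach a budget when reasoning is ON; it is dropped otherwise.
    if reasoning_on and max_thinking_tokens is not None:
        payload["max_thinking_tokens"] = max_thinking_tokens
    return payload


if __name__ == "__main__":
    print(json.dumps(build_request("Is 391 prime?", True, 512), indent=2))
```

Dropping the budget when reasoning is OFF is a design choice of this sketch, so that a payload never carries a limit that has nothing to apply to.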
Limiting reasoning has minimal impact on simple Q&A; for complex reasoning, use ON with a budget.
Both the ON/OFF toggle and the budget can be set per call, depending on the scenario.
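One way to make the per-call choice explicit is a small lookup from scenario to settings. This is only a sketch: the scenario labels, the budget values, and the names `ReasoningSettings` and `settings_for` are all assumptions made for illustration, loosely following the ON/OFF guidance above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReasoningSettings:
    reasoning_on: bool
    max_thinking_tokens: Optional[int]  # None means no budget is sent


# Hypothetical scenario labels and budget values, chosen only for illustration.
SCENARIO_SETTINGS = {
    "chat": ReasoningSettings(False, None),
    "fan_out_agent": ReasoningSettings(False, None),
    "multi_step": ReasoningSettings(True, 1024),
    "tool_use": ReasoningSettings(True, 1024),
    "math": ReasoningSettings(True, 2048),
}


def settings_for(scenario: str) -> ReasoningSettings:
    """Return the settings for one call; unknown scenarios fall back to OFF."""
    return SCENARIO_SETTINGS.get(scenario, ReasoningSettings(False, None))


if __name__ == "__main__":
    for name in ("chat", "math", "unlisted"):
        print(name, settings_for(name))
```

Falling back to OFF for unknown scenarios favors lower cost; a deployment that values accuracy more might default to ON with a modest budget instead.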