Claude Code thinking mode
In recent versions, Claude Code handles thinking mode differently from the version shown in Claude Code in Action.
From https://github.com/anthropics/claude-code/issues/9072#issuecomment-3648376741:
- Only `ultrathink` is the special keyword that enables thinking on a per-request basis. Phrases like “think”, “think hard”, “think more” don’t have any impact on the allocated thinking token budget.
- If you need fine-grained control over the token budget (vs. just allocating all 31,999 tokens to the thinking budget), you can set the `MAX_THINKING_TOKENS` environment variable. This setting takes priority over `ultrathink`, so if you set a lower `MAX_THINKING_TOKENS` threshold, `ultrathink` doesn’t override it.
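
The precedence described in that comment can be modeled with a small Python sketch. This is an illustration of the stated rules, not Claude Code's actual implementation: the function name `requested_budget`, the case-insensitive substring match, and the assumption that thinking is otherwise off are all hypothetical.

```python
ULTRATHINK_BUDGET = 31_999  # full budget quoted in the comment above


def requested_budget(prompt: str, env: dict) -> int:
    """Model the thinking budget for a single request (illustrative only).

    Assumes thinking is otherwise off, so only the keyword can enable it.
    """
    # Assumption: a case-insensitive substring match stands in for however
    # Claude Code really detects the keyword.
    if "ultrathink" not in prompt.lower():
        # "think", "think hard", "think more" do not change the budget.
        return 0

    cap = env.get("MAX_THINKING_TOKENS")
    if cap is not None:
        # The environment variable takes priority over the keyword.
        return int(cap)

    return ULTRATHINK_BUDGET


print(requested_budget("think hard about this bug", {}))
print(requested_budget("ultrathink about this bug", {}))
print(requested_budget("ultrathink about this bug", {"MAX_THINKING_TOKENS": "8000"}))
```

The third call shows the case the comment singles out: with a lower `MAX_THINKING_TOKENS` set, `ultrathink` does not raise the budget back to 31,999.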
From https://code.claude.com/docs/en/common-workflows#use-extended-thinking-thinking-mode:
- Sonnet 4.5 and Opus 4.5 have thinking enabled by default. All other models have thinking disabled by default.
- When thinking is enabled (via `/config` or `ultrathink`), Claude can use up to 31,999 tokens from your output budget for internal reasoning.
- Note that `ultrathink` both allocates the thinking budget AND semantically signals to Claude to reason more thoroughly, which may result in deeper thinking than necessary for your task.
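
As a rough way to read the defaults listed above, the following Python sketch models when thinking ends up enabled. It is a simplified model under stated assumptions, not how Claude Code is implemented: the function `thinking_enabled`, the `config_toggle` parameter (standing in for the `/config` setting), the model name strings, and the rule that `ultrathink` wins over an explicit toggle are all hypothetical.

```python
from typing import Optional

# Models the docs list as having thinking enabled by default.
# Assumption: display names are used as keys purely for illustration.
DEFAULT_ON_MODELS = {"Sonnet 4.5", "Opus 4.5"}


def thinking_enabled(
    model: str,
    config_toggle: Optional[bool] = None,
    prompt: str = "",
) -> bool:
    """Return whether thinking would be on for this request (illustrative)."""
    # Assumption: the per-request keyword enables thinking regardless of
    # the model default or the /config setting.
    if "ultrathink" in prompt.lower():
        return True
    # An explicit /config choice replaces the model default.
    if config_toggle is not None:
        return config_toggle
    # Otherwise fall back to the per-model default.
    return model in DEFAULT_ON_MODELS


print(thinking_enabled("Sonnet 4.5"))
print(thinking_enabled("some-other-model"))
print(thinking_enabled("some-other-model", prompt="ultrathink about the design"))
```

Because `ultrathink` also signals Claude to reason more thoroughly, enabling thinking through `/config` may be the lighter option when a task only needs a modest amount of reasoning.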