“As models get smarter, they can solve problems in fewer steps: less backtracking, less redundant exploration, less verbose reasoning. Claude Opus 4.5 uses dramatically fewer tokens than its predecessors to reach similar or better outcomes.”1

When Anthropic launched Opus 4.5 in November 2025, the bigger, more expensive model was actually cheaper to use.

On a per-token basis, Opus 4.5 cost 67% more than Sonnet.2 But it used 76% fewer tokens to reach the same outcome.1 A task that cost $1 on Sonnet cost about $0.40 on Opus.
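The arithmetic behind that comparison takes a few lines. This is a sketch using the input-token prices from the pricing footnote as a stand-in for blended cost; real bills mix input and output rates:

```python
sonnet_price = 3.0      # $ per million input tokens (Sonnet)
opus_price = 5.0        # $ per million input tokens (Opus 4.5): 67% higher
token_reduction = 0.76  # Opus 4.5 used 76% fewer tokens per task

# Normalize to a task that costs $1.00 on Sonnet.
sonnet_task_cost = 1.00
opus_task_cost = sonnet_task_cost * (opus_price / sonnet_price) * (1 - token_reduction)

print(f"${opus_task_cost:.2f}")  # → $0.40
```

The per-token premium (1.67×) is more than offset by the token reduction (0.24×), so the effective task cost drops to 40 cents on the dollar.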

The trend across vendors has been smarter models using fewer tokens per task.

Model Token Efficiency Tradeoff

| Comparison | Tokens per task | Tradeoff |
| --- | --- | --- |
| Claude Opus 4.5 vs Sonnet¹ | −76% | 67% higher per-token cost |
| GPT-5.4 vs 5.2³ | −25% | Responses 24% longer |
| Gemini 3 vs 2.5⁴ | −74% | None measured |
| Claude Opus 4.7 vs 4.6⁵ | +47% | Optimized for code domains |

Then Opus 4.7 shipped, and the smarter model became much more expensive. The cause: a new tokenizer, the software that breaks text into pieces a computer understands.6

[Figure: how tokenizers break text into pieces, showing “unbelievable” split into un, belie, vable]

Smaller pieces force the model to pay closer attention to each word, like reading a contract word by word instead of skimming paragraphs. The model follows instructions more precisely and makes fewer mistakes on coding tasks. The tradeoff: more tokens, higher costs.
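To make the granularity tradeoff concrete, here is a toy greedy longest-match tokenizer run against two invented vocabularies, one coarse and one fine. Production tokenizers use learned subword vocabularies (e.g. byte-pair encoding), so both the algorithm and the pieces below are illustrative assumptions, not Anthropic's actual tokenizer:

```python
def tokenize(text, vocab):
    """Greedy longest-match subword split (toy illustration)."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest piece starting at i; fall back to one character.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens

coarse = {"unbelievable"}        # whole word is a single piece
fine = {"un", "believ", "able"}  # word splits into morpheme-like pieces

print(tokenize("unbelievable", coarse))  # → ['unbelievable']
print(tokenize("unbelievable", fine))    # → ['un', 'believ', 'able']
```

The same text costs one token under the coarse vocabulary and three under the fine one: finer resolution means more tokens billed for identical content.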

“For text, I’m seeing 1.46x more tokens for the same content. We can expect it to be around 40% more expensive in practice.” - Simon Willison7

Boris Cherny, creator of Claude Code, acknowledged Anthropic raised rate limits “to make up for it.”
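Willison's multiplier converts mechanically into a cost ceiling, assuming per-token prices stay flat and cost scales linearly with billed tokens:

```python
token_multiplier = 1.46  # same content, 1.46x the tokens (Willison's measurement)
nominal_increase = (token_multiplier - 1) * 100

print(f"up to {nominal_increase:.0f}% higher cost at unchanged per-token prices")
# → up to 46% higher cost at unchanged per-token prices
```

His “around 40% in practice” estimate sits just below this 46% ceiling.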

Will smarter models grow more expensive because of greater accuracy, or cheaper because they reason more efficiently? Tokenizer resolution increases push costs up, then efficiency gains bring them back down: a sawtooth pattern. But every upswing in that pattern means generating more tokens.


  1. Anthropic, Introducing Claude Opus 4.5

  2. Opus 4.5: $5/$25 per million tokens (input/output) vs Sonnet: $3/$15. Anthropic Pricing

  3. NxCode, GPT 5.4 Complete Guide 2026

  4. Google Cloud Medium, Gemini 3 vs 2.5

  5. Claude Code Camp, I Measured Claude 4.7’s New Tokenizer

  6. Take the word “unbelievable.” A tokenizer might break it into un, believe, and able. This helps the computer understand that the word is the opposite (un) of a core concept (believe) that is possible (able).

  7. Simon Willison’s Weblog