View / Why companies are penny-pinching on tokens

Reed Albergotti
Tech Editor, Semafor
Jun 3, 2026, 2:31pm EDT
Jay Parikh.
Steve Marcus/Reuters

What’s behind all the token stinginess? Last year, it looked like a lot of work could be handled by smaller AI models fine-tuned for specific tasks. But that’s not the way things have gone.

As it turns out, harnesses that massively increase the power of AI models tend to work better on frontier models. And the more Anthropic or OpenAI tokens you throw at a problem, the better the results.

AI capabilities are moving too fast for open-source or competing models to close the gap. The added cost of snagging a 10% to 20% edge in performance is worth it for faster-moving companies, but maybe not for older, slower ones.

“If you spend a bunch of money on tokens, what is that code meant for? Just generating a lot of code doesn’t do anything,” Jay Parikh, Microsoft’s executive VP of Core AI, told me at the Build conference.

At some point, though, either the capabilities will begin to plateau or the difference in performance for most white-collar tasks will be so small that it won’t matter as much.

When that happens, the big tech companies will begin to optimize and costs will come down very fast. Companies like Microsoft and Google, which have seen previous tech waves, are waiting for this moment. It’s where they usually shine.
