Date: 2025-01-24

Google has found a cheaper way to run AI models, one of the tricks up its sleeve that could give it a long-term edge in the high-stakes race between the largest tech companies, DeepMind co-founder Demis Hassabis said in an interview with Semafor.

For years, the compute power used in generative artificial intelligence was concentrated in the “pre-training” phase, when a raw AI model is initially created. But as models have evolved, the demands of running them — known as inference — have grown.

If an AI model were a brain, inference would be akin to thinking. And it turns out thinking longer can drastically increase a model’s capabilities. That means the compute power available to AI companies today isn’t sufficient to extract the full value of the technology.

Hassabis said new processors — known as “light chips” — are in the works that could make it more cost-effective to run the models.

“Sometimes you have the ‘victim of your own success’ problem,” Hassabis said. “If you build a very performant model, like [Google’s Gemini] 2.0 Flash, everyone wants it, which is great. But then suddenly, you only have a set amount of chips. You need more for serving.”

He said the new Google chips are based on the same architecture as the company’s Tensor Processing Units, custom-designed AI chips that Google has been working on for around a decade.