With few exceptions, companies working in the AI industry or offering AI services to their employees say GPT-4 is the undisputed winner in terms of capability.

Benchmarks don’t tell the whole story. These companies’ assessments are based mainly on real-world experience. Each has its own criteria based on its specific needs, and in pretty much every case, GPT-4 has no close competition.

It could be that Gemini’s claimed success in a wide variety of benchmarks will mean it outperforms GPT-4 in the real world. We won’t know for sure until Google’s model reaches wide distribution and is put to the test by the same companies that have found GPT-4 to be the most capable.

And where Google competitor Microsoft is reliant on OpenAI to develop new models, Google has now shown it is able to build state-of-the-art AI completely in-house. That advantage is especially important after OpenAI CEO Sam Altman was fired last month from the company under mysterious circumstances, only to be rehired after the startup came close to dissolving.

Still, by at least one practical measure, GPT-4 remains the undisputed winner. It can ingest about 300 pages of text in a single prompt, a measure known as the “context window.” That capability matters for use cases like legal research, which depend on analyzing long documents. Gemini can only handle about one quarter as much text, according to the paper released Wednesday, though this is the first version and the context window will increase, along with other capabilities.

But for most enterprise needs, Gemini Ultra will be overkill, just like GPT-4. Most companies find they can use much smaller and less capable models, which are less expensive, with the same amount of success. That’s because for business use cases, companies are not looking for general purpose AI. They want models that zero in on data stored on corporate servers.

Today, general purpose AI models like GPT-4 and Gemini are useful for consumers. But there’s another possible customer for Gemini: startups.

A new generation of AI startups aims to create “agents” that can take autonomous action on behalf of users. Think of them as AI personal assistants. Today, even GPT-4 is insufficient to deliver this experience. Could Gemini, with its multimodal capabilities, allow more ambitious products from AI startups?

We won’t know until startups can use it in earnest, but some of the capabilities Google showed off in demos suggest it may represent a new level of capability.

Even if Gemini is not a game changer right off the bat, it clearly represents a long-term threat to OpenAI’s dominance. When it comes to LLMs like Gemini, Google is kind of a sleeping giant awakened by ChatGPT.

Many of Google’s best minds reside within DeepMind, which has been busy on more narrow applications of AI. DeepMind’s accomplishments like AlphaFold are arguably more impactful and important than ChatGPT.

Now, DeepMind is focusing its brain power on general purpose AI models and the results are pretty stark. Gemini is on its first version and looks like it may have instantly become the industry standard. I can only imagine what Gemini 3 or 4 will look like.