

Early OpenAI investor bets on alternative to Sam Altman’s approach to AI

Updated Apr 12, 2024, 2:50pm EDT
tech
Steve Jennings/Getty Images for TechCrunch

The Scene

Venture capitalist Vinod Khosla made one of the shrewdest bets of his career in 2019, when he put $50 million into OpenAI, now the darling of the generative AI boom. In a sign of just how fast that industry is moving, Khosla recently made another big investment, this time in a company built around the belief that there is a better alternative to OpenAI’s roadmap.

Symbolica AI, co-founded by former Tesla Autopilot engineer George Morgan, is building an AI-assisted coding tool that it says uses a method of machine learning that works completely differently from the cutting-edge foundation models made by OpenAI, Google, and other major AI companies.

With this new approach, Morgan says Symbolica’s models won’t require the same massive, power-hungry compute that companies are now spending tens of billions of dollars to procure for the most advanced AI models.


In an interview with Semafor, Morgan said those investments are based on speculation that, if given enough data and enough compute resources, later versions of these models will become more intelligent.

Without mathematical proof showing how these things work, Morgan says the process is more like alchemy. “That’s exactly what AI models are today,” he said. “You mix a bunch of random stuff together, you test it, you see if it does the thing or not. If it doesn’t, you try something else. What Symbolica is doing is bringing this into the era of chemistry.”


Know More

Each major breakthrough in AI has come from removing human involvement from part of the process. Before deep learning, machine learning required humans to meticulously label data so that algorithms could learn a task, deciphering patterns and making predictions. Deep learning obviates the need for that labeling: The software can, in essence, teach itself the task.


But humans have still been needed to design the architecture that tells a computer how to learn. Large language models like the one behind ChatGPT came from an architectural breakthrough known as the transformer. It was a major advance that allowed neural networks, the method at the heart of deep learning, to keep improving as they grew to unfathomably large sizes. Before the transformer, neural networks plateaued after reaching a certain size.
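To make that concrete, here is a minimal sketch of the transformer’s core operation, scaled dot-product self-attention, written in plain NumPy. It illustrates the published mechanism in general terms; the dimensions and weights are made up for the example, and this is not any company’s production code.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors.

    X: (seq_len, d_model) token embeddings; Wq, Wk, Wv: learned projections.
    Every position attends to every other position, the property that lets
    transformers keep improving as models and training data grow.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                          # weighted mix of value vectors

# Toy usage: 4 tokens with 8-dimensional embeddings and random weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # -> (4, 8)
```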

That is why Microsoft and others are spending tens of billions on AI infrastructure: It is a bet that bigger will continue to mean better.

The big downside of this kind of neural network, though, is that the transformer is imperfect. It tells the model to predict the next word in a sentence based on how tokens, or groups of letters, statistically relate to one another. But there is nothing inherent in the model about the deeper meaning of those words.


It is this limitation that leads to what we call hallucinations; transformer-based models don’t understand the concept of truth.
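A toy example makes the limitation concrete. The sketch below builds a crude next-word predictor purely from word co-occurrence counts; it is an illustration of statistical next-token prediction in general, not Symbolica’s or OpenAI’s code, and the tiny corpus is invented. The model continues a sentence with whatever followed most often in training, whether or not that continuation is true.

```python
from collections import Counter, defaultdict

# Train a crude bigram "language model": count which word follows which.
corpus = "the sky is blue . the sky is green .".split()
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(prev):
    # Return the statistically most common continuation. The model has no
    # concept of whether "blue" or "green" is actually true; it only knows
    # which word appeared more often after `prev` in the training data.
    return follows[prev].most_common(1)[0][0]

print(predict("is"))  # decided by counts alone, not by truth
```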

Morgan and many other AI researchers believe that if there is an AI architecture that can learn concepts like truth and reasoning, it will be developed by the AI itself, not by humans. “Now, humans no longer have to describe the architecture,” he said. “They just describe the constraints of what they want.”

The trick, though, is getting the AI to take on a task that seems to exist beyond the comprehension of the human brain. The answer, he believes, has something to do with a mathematical concept known as category theory.

Increasingly popular in computer science and artificial intelligence, category theory can turn real-world concepts into mathematical formulas, which can be converted into a form of computer code. Symbolica employees, along with researchers from Google DeepMind, published a paper on the subject last month.

The idea is that category theory could be a method to instill constraints in a common language that is precise and understandable to humans and computers. Using category theory, Symbolica hopes its method will lead to AI with guardrails and rules baked in from the beginning. In contrast, foundation models based on transformer architecture require those factors to be added on later.
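For a flavor of what machine-checkable constraints can look like, here is a small illustrative sketch: a structure-preserving map (a functor, in category-theory terms) comes with laws that can be stated precisely and verified by a program, so the property is baked in rather than bolted on. This is a generic example of the idea, not the construction from the Symbolica and DeepMind paper.

```python
# A functor must preserve identity and composition. Writing those laws as
# executable checks is a tiny example of a constraint that reads the same
# to humans and to machines. (Illustrative only; not Symbolica's method.)

def fmap(f, xs):
    """Lists form a functor: fmap applies f inside the list structure."""
    return [f(x) for x in xs]

def satisfies_functor_laws(xs, f, g):
    identity = lambda x: x
    law1 = fmap(identity, xs) == xs                             # fmap id == id
    law2 = fmap(lambda x: g(f(x)), xs) == fmap(g, fmap(f, xs))  # composition
    return law1 and law2

print(satisfies_functor_laws([1, 2, 3], lambda x: x + 1, lambda x: x * 2))  # True
```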

Morgan said this approach will be the key to creating AI models that are reliable and don’t hallucinate. But like OpenAI, Symbolica is aiming big, in hopes that its new approach to machine learning will lead to the holy grail: software that knows how to reason.

Symbolica, though, is not a direct competitor to foundation model companies like OpenAI. It views its core product as bespoke AI architectures that customers can use to build their own AI models.

That is an entirely new concept in the field. For instance, Google did not view the transformer architecture as a product. In fact, it published the research so that anyone could use it.

Symbolica plans to build customized architectures for customers, which will then use them to train their own AI models. “If they give us their constraints, we can just build them an architecture that meets those constraints and we know it’s going to work,” Morgan said.
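In the abstract, that hand-off might look something like the hypothetical sketch below, where a customer states constraints and receives an architecture that satisfies them. Every name and interface here is invented for illustration; Symbolica has not published a product API.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical "constraints in, architecture out" hand-off. All names are
# invented for illustration; this is not Symbolica's actual interface.

@dataclass
class Constraint:
    name: str
    check: Callable[[dict], bool]  # property a candidate architecture must satisfy

def build_architecture(candidates, constraints):
    """Return the first candidate architecture meeting every constraint."""
    for arch in candidates:
        if all(c.check(arch) for c in constraints):
            return arch
    raise ValueError("no candidate satisfies all constraints")

# Toy usage: require a small architecture that preserves a symmetry.
candidates = [
    {"name": "big-generalist", "params": 10**9, "equivariant": False},
    {"name": "small-specialist", "params": 10**7, "equivariant": True},
]
constraints = [
    Constraint("small", lambda a: a["params"] < 10**8),
    Constraint("equivariant", lambda a: a["equivariant"]),
]
print(build_architecture(candidates, constraints)["name"])  # small-specialist
```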

Morgan said the method will lead to interpretability, a buzzword in the AI industry these days that means the ability to understand why models act the way they do. The lack of interpretability is a major shortcoming of large language models, which are so vast that it is extremely challenging to understand how, exactly, they came up with their responses.

The limitation of Symbolica’s models, though, is that they will be more narrowly focused on specific tasks compared to generalist models like GPT-4. But Morgan said that’s a good thing.

“It doesn’t make any sense to train one model that tries to be good at everything when you could train many tinier models for less money that are way better than GPT-4 could ever be at a specific task,” he said.

(Correction: An earlier version of this article incorrectly said that some Symbolica employees had worked at Google DeepMind.)


Reed’s view

I haven’t seen Symbolica’s technology yet, so I can’t vouch for its capabilities. But tech leaders have told me that the next big breakthrough in AI is likely removing humans from the architecture step of its development.

There is a big market for what Symbolica is trying to build. While consumers may want to chat with AIs about any topic, businesses want the opposite, with language interfaces that are narrowly focused and totally reliable. Hallucination, in many enterprise contexts, is simply unacceptable.

In the near term, the large transformer models will get bigger and more capable. According to people who have used the still-under-wraps GPT-5, the next generation of OpenAI’s technology, it is much closer to reasoning abilities than GPT-4. It still hallucinates and is definitely not AGI, but it sounds like it is good enough to be more useful to a wider swath of customers.

Ten years from now, we may look back and realize that scaling transformer-based models only got us so far, and that new methods like Symbolica’s represented the path to AI with reasoning capabilities.

Even if that’s the case, today’s foundation models will have played a critical role. ChatGPT and the resulting AI craze have inspired a wave of investment and talent pouring into the field.


Room for Disagreement

Ilya Sutskever, an OpenAI cofounder and its chief scientist, said in a podcast last year that he believes there’s no question that scaling up the transformer architecture is the path toward “artificial general intelligence.” He said the transformer represents the “one big uniform architecture” that underpins intelligence, much as the brain structures of humans and other animals are alike. “The best way to think about the question of architecture is not in terms of a binary ‘is it enough’ but ‘how much effort, what will be the cost of using this particular architecture?’”

“At this point I don’t think anyone doubts that the transformer architecture can do amazing things, but maybe something else, maybe some modification, could have some compute efficiency benefits. So better to think about it in terms of compute efficiency rather than in terms of if it can get there at all.”


Notable

  • If you want to geek out on category theory and how it might be used in machine learning, here’s a relatively accessible treatise on Medium that you might enjoy.