Each major breakthrough in AI has occurred by removing human involvement from part of the process. Before deep learning, machine learning depended on humans meticulously labeling data so that algorithms could learn the task, decipher patterns and make predictions. Deep learning largely obviates the need for that labeling. The software can, in essence, teach itself the task.

But humans have still been needed to build the architecture that tells a computer how to learn. Large language models like ChatGPT came from a breakthrough in architecture known as the transformer. It was a major advance that allowed neural networks, the models at the heart of deep learning, to keep improving as they grew to unfathomably large sizes. Before the transformer, neural networks plateaued after reaching a certain size.

That is why Microsoft and others are spending tens of billions on AI infrastructure: It is a bet that bigger will continue to mean better.

The big downside, though, is that the transformer is imperfect. It tells the model to predict the next word in a sentence based on how groups of letters, known as tokens, relate to one another. But there is nothing inherent in the model about the deeper meaning of those words.
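
To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is not a transformer: it swaps in a far simpler technique, a bigram counter, and the corpus and the function names `train_bigram` and `predict_next` are invented for illustration. It shares only the point made above, which is that the prediction comes from how often words follow one another, not from what they mean.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text: the false statement appears more often than the true one.
corpus = (
    "the moon is made of cheese . "
    "the moon is made of rock . "
    "the moon is made of cheese ."
)
model = train_bigram(corpus)

# The model picks the statistically likelier continuation, true or not.
print(predict_next(model, "of"))  # prints "cheese"
```

The toy model completes "the moon is made of" with "cheese" simply because that pairing is more common in its training text. Nothing in the counts represents whether the statement is true.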

It is this limitation that leads to what we call hallucinations; transformer-based models don’t understand the concept of truth.

Morgan and many other AI researchers believe that if there is an AI architecture that can learn concepts like truth and reasoning, it will be developed by the AI itself, not by humans. “Now, humans no longer have to describe the architecture,” he said. “They just describe the constraints of what they want.”

The trick, though, is getting the AI to take on a task that seems to exist beyond the comprehension of the human brain. The answer, he believes, has something to do with a mathematical concept known as category theory.

Increasingly popular in computer science and artificial intelligence, category theory can turn real-world concepts into mathematical formulas, which can be converted into a form of computer code. Symbolica employees, along with researchers from Google DeepMind, published a paper on the subject last month.

The idea is that category theory could be a way to express constraints in a common language that is precise and understandable to both humans and computers. Using category theory, Symbolica hopes its method will lead to AI with guardrails and rules baked in from the beginning. In contrast, foundation models based on the transformer architecture require those factors to be added on later.

Morgan said this approach will be the key to creating AI models that are reliable and don’t hallucinate. And like OpenAI, Symbolica is aiming big, in hopes that its new approach to machine learning will lead to the holy grail: software that knows how to reason.

Symbolica, though, is not a direct competitor to foundation model companies like OpenAI and doesn’t want to make AI models itself. Instead, it wants to sell architecture.

That is an entirely new concept in the field. For instance, Google did not view the transformer architecture as a product. In fact, it published the research so that anyone could use it.

Symbolica plans to build customized architectures for customers, who will then use them to train their own AI models. “If they give us their constraints, we can just build them an architecture that meets those constraints and we know it’s going to work,” Morgan said.

Morgan said the method will lead to interpretability, a buzzword in the AI industry these days that means the ability to understand why models act the way they do. The lack of interpretability is a major shortcoming of large language models, which are so vast that it is extremely challenging to understand how, exactly, they came up with their responses.

The limitation of Symbolica’s models, though, is that they will be more narrowly focused on specific tasks than generalist models like GPT-4. But Morgan said that’s a good thing.

“It doesn’t make any sense to train one model that tries to be good at everything when you could train many, tinier models for less money that are way better than GPT-4 could ever be at a specific task,” he said.