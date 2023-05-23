Scott is an anomaly in the tech industry. He grew up poor in a small Virginia town not known for producing C-suite technology executives, and completed most of a PhD in computer science at the University of Virginia.

In his spare time, he makes leather bags and backpacks (a lot of them, from the looks of his Instagram page), as well as guitar picks and tools. His Bay Area workshop, which includes laser cutters, drill presses, and industrial sewing machines, is legendary in Silicon Valley.

On paper, he seemed like an unlikely candidate to make Microsoft the leader in artificial intelligence. The field is full of PhDs from MIT, Stanford, and Carnegie Mellon.

“The nice thing about Kevin is he's not bothered with any of that,” said Mike Volpi, a venture capitalist and a friend of Scott’s. “He doesn't seem to need the reinforcement. He's sort of independent of what I would characterize as the mainstream, accepted way of doing things, which lets him do stuff like what he did at Microsoft.”

Pinterest’s head of engineering, Jeremy King, who has been having breakfast with Scott for years, said his friend could see what would happen with AI far ahead of anyone else.

“He's always just a great guy to bounce ideas off of like, ‘hey, we're going this way, we're thinking about combining these two things together.’ Most of the time, he's already thought about it or knew people who thought about it,” King said.

Scott was an engineer at Google before joining LinkedIn. By the time Microsoft completed its acquisition of LinkedIn in December 2016, Scott was senior VP of engineering and operations. After the deal, Microsoft CEO Satya Nadella named Scott CTO of the parent company.

At the time of the acquisition, hype around AI had turned to disillusionment. While the technology was everywhere, used to automate processes in just about every industry, consumer applications like virtual assistants and chatbots had failed miserably. Even autonomous driving, once considered right around the corner, had turned into a far-off goal.

But OpenAI was about to make a big bet on “transformer models,” then a new kind of artificial intelligence technique that would eventually power the company’s ChatGPT and DALL-E products.

In the past, AI models always plateaued in capability after they reached a certain size. In theory, these transformer models would break the mold and keep “learning” with more and more data.

But that was, by definition, a hypothesis. Testing it was a big technical challenge that involved spending huge sums of money. Under Scott’s direction, Microsoft built a supercomputer with 285,000 central processing unit cores and 10,000 GPUs.

Training OpenAI’s models was ultimately successful, but the journey was not a linear one, Scott said. At times, it would look as if progress was slowing down or stopping completely.

“You think those thoughts because there are a whole bunch of people saying those things all along,” he said. “Whenever you're making a heavy bet, you'll have a full spectrum of people. Some that just completely don't believe that this is a reasonable thing to do or that it's technically flawed, or a dumb allocation of capital.”

In training large language models, it’s difficult to predict exactly when improvements will occur. As the models get bigger, new abilities emerge and then suddenly disappear as the models get even bigger. Then they can reappear later on. Researchers aren’t quite sure why.

Scott said he drew on his experience to steel his resolve. “When you don't have the experience, you've never seen one of these cycles all the way before, it can be really, really disconcerting,” he said. “All of that fear and anxiety tends to make people super cautious, which is exactly the opposite of what you have to do when you're trying to make something very big happen.”