AI’s Smarts Now Come With a Big Price Tag

Calvin Qi, who works at a search startup called Glean, would love to use the latest artificial intelligence algorithms to improve his company’s products.

Glean provides tools for searching through applications like Gmail, Slack, and Salesforce. Qi says new AI techniques for parsing language would help Glean’s customers unearth the right file or conversation a lot faster.

But training such a cutting-edge AI algorithm costs several million dollars. So Glean uses smaller, less capable AI models that can’t extract as much meaning from text.

“It is hard for smaller places with smaller budgets to get the same level of results” as companies like Google or Amazon, Qi says. The most powerful AI models are “out of the question,” he says.

AI has spawned exciting breakthroughs in the past decade—programs that can beat humans at complex games, steer cars through city streets under certain conditions, respond to spoken commands, and write coherent text based on a short prompt. Writing in particular relies on recent advances in computers’ ability to parse and manipulate language.

Those advances are largely the result of feeding the algorithms more text as examples to learn from, and giving them more chips with which to digest it. And that costs money.

Consider OpenAI’s language model GPT-3, a large, mathematically simulated neural network that was fed reams of text scraped from the web. GPT-3 can find statistical patterns that predict, with striking coherence, which words should follow others. Out of the box, GPT-3 is significantly better than previous AI models at tasks such as answering questions, summarizing text, and correcting grammatical errors. By one measure, it is 1,000 times more capable than its predecessor, GPT-2. But training GPT-3 cost, by some estimates, almost $5 million.

“If GPT-3 were accessible and cheap, it would totally supercharge our search engine,” Qi says. “That would be really, really powerful.”

The spiraling cost of training advanced AI is also a problem for established companies looking to build their AI capabilities.

Dan McCreary leads a team within one division of Optum, a health IT company, that uses language models to analyze transcripts of calls in order to identify higher-risk patients or recommend referrals. He says even training a language model that is one-thousandth the size of GPT-3 can quickly eat up the team’s budget. Models need to be trained for specific tasks, and training can cost more than $50,000, paid to cloud computing companies for the use of their computers and programs.

McCreary says cloud computing providers have little reason to lower the cost. “We cannot trust that cloud providers are working to lower the costs for us building our AI models,” he says. He is looking into buying specialized chips designed to speed up AI training.

One reason AI has progressed so rapidly in recent years is that many academic labs and startups could download and use the newest ideas and techniques. Algorithms that produced breakthroughs in image processing, for instance, emerged from academic labs and were developed using off-the-shelf hardware and openly shared data sets.

Over time, though, it has become increasingly clear that progress in AI is tied to an exponential increase in the underlying computer power.

Big companies have, of course, always had advantages in terms of budget, scale, and reach. And large amounts of computer power are table stakes in industries like drug discovery.

Now, some are pushing to scale things up further still. Microsoft said this week that, with Nvidia, it had built a language model more than twice as large as GPT-3. Researchers in China say they’ve built a language model that is four times larger than that.

“The cost of training AI is absolutely going up,” says David Kanter, executive director of MLCommons, an organization that tracks the performance of chips designed for AI. The idea that larger models can unlock valuable new capabilities can be seen in many areas of the tech industry, he says. It may explain why Tesla is designing its own chips just to train AI models for autonomous driving.

Some worry that the rising cost of tapping the latest and greatest tech could slow the pace of innovation by reserving it for the biggest companies, and those that lease their tools.

“I think it does cut down innovation,” says Chris Manning, a Stanford professor who specializes in AI and language. “When we have only a handful of places where people can play with the innards of these models of that scale, that has to massively reduce the amount of creative exploration that happens.”

Will Knight