The future of foundation models is closed-source

The walled garden sure looks nice.

Two seemingly contradictory but equally popular narratives about the future of foundation models have taken hold. In one future, AI centralizes: scaling laws will hold, and value accrues primarily to scaled, closed-source players. In the other future, AI decentralizes: foundation models have no moat, open-source has caught up to closed-source, and we’ll have many competing models.

Today, both narratives seem true. We have powerful closed models, and a thriving ecosystem of sacrosanct open-source models. Llama-3 recently put open-source on the map of GPT-4 class models. Meanwhile, an unusual open-source alliance has formed among developers who want handouts, academics who embrace publishing culture, libertarians who fear centralized speech control and regulatory capture, Elon who doesn’t want his nemesis to win AI, and Zuck who doesn’t want to be beholden to yet another tech platform.

As an accelerant of modern software, open-source maintains a cherished place in tech. Who can argue with free stuff, decentralized control, and free speech? But open and closed-source AI cannot both dominate in the limit: if centralizing forces hold, scale advantages will compound and leave open-source alternatives behind.

Despite recent progress and endless cheerleading, open-source AI will become a financial drain for model builders, an inferior option for developers and consumers, and a risk to national security. Closed-source models will create far more economic and consumer value over the next decade.

Open-source software started as an act of charity – the world owes the likes of Linus Torvalds and Fabrice Bellard for endowing humanity with Linux, Git, and FFmpeg. Because free stuff is popular, open-source became a great freemium marketing strategy (think Databricks or Mistral), and sometimes a market equilibrium in itself (Android, for example, is a cheap smartphone option that reinforces Google’s search monopoly).

Companies that earned free marketing from open-source eventually succumbed to business physics: Red Hat ended CentOS as a free downstream rebuild and restricted RHEL source access to subscribers, Elastic changed Elasticsearch’s licensing after accidentally seeding competition, and Databricks owns the IP that accelerates Apache Spark.

Unlike the charity work of open-source in the early software era, today it is subsidized by businesses with their own goals. Given Meta is the primary deep-pocketed large open-source model builder, open-source AI has become synonymous with Meta AI. So the operative question for open-source AI is, what game is Meta playing? In a recent podcast, Zuckerberg explains Meta’s open-source strategy:

  1. He was burned by Apple’s closedness for the past 2 decades, and doesn’t want to suffer the same fate with the next platform shift. It’s a safer bet to commoditize your complements.

  2. He likes building cool products, and cheap, performant AI enhances Facebook and Instagram. There is some call option value if AI assistants become the next platform.

  3. He bought hundreds of thousands of H100s for improving social feed algorithms across products, and this seems like a good way to use the extras.

That all makes sense, and Llama has been great developer marketing for Facebook. But Zuck also suggests several times that there’s some point at which open-source AI no longer makes sense, either from a cost or safety perspective. When asked whether Meta will open-source the future $10b model, the answer is “as long as it’s helping us”. At some point, they’ll shift their focus from charity to profit.

Unlike other model providers, Meta is not in the business of selling model access via API. So while they’ll open-source as long as it is convenient for them, developers are on their own for model improvements thereafter.

That raises the question: if Meta is only pursuing open-source insofar as it benefits itself, what is the tipping point at which Meta stops open-sourcing its AI? Sooner than you think:

  • Exponential data: Frontier models were trained on the corpus of the internet, but that data source is a commodity – model differentiation over the next decade will come from proprietary data, both via model usage and private data sources.

    Open-source models have no feedback loop between production usage and model training, so they foot the bill for all incremental training data, whereas closed-source models drive compounding value with data from incremental usage. If Meta differentiates their model based on their social graph or user feedback, they’ll want to capture that value via their closed products, and not share it with the world.

  • Exponential capex: A lagging-edge model that requires just a few percent of Meta’s $40b in capex is easy to open-source, and nobody will ask questions. But once you reach ten billion dollars or more in capex spend for model training, shareholders will want clear ROI on that spend (the Metaverse raised some question marks at a certain scale, too).

  • Diminishing returns on model quality within Meta: There is a large upfront benefit for Meta in building an open-sourced AI model, even if it’s worse than the frontier closed-source counterpart. There are lots of small AI workloads (think feed algorithms, recommendations, and image generation) where Meta doesn’t want to rely on a third-party provider the way they had to rely on Apple.

    But it’s unclear whether Facebook products benefit much from models approaching AGI quality. It’s equally possible that Meta’s model improvements will be very particular to their own internal use cases. And this is where things aren’t aligned with users: if the ROI of generalized, frontier models doesn’t make sense for Meta’s products, they certainly won’t build them for the open-source community.

Zuck is not running a charity; he’s a savvy capitalist. While Meta can justify scaling capex on incremental models for their own ends, their open-source strategy will only make less sense over time.

As a developer choosing an open-source model, what do you get in terms of cost, model quality, and data security?

Cost: Open-source models have the illusion of being free. But developers bear the inference costs, which are often more expensive than comparable LLM API calls: either pay a middleman to manage GPUs and host models, or pay the direct costs of GPU depreciation, electricity, and downtime. Only large enterprise scale can amortize these fixed costs; in other categories like cloud infrastructure, even the largest F500 companies use third-party cloud hosting like AWS and Azure. Unoptimized GPU spend punishes you with the diseconomies of subscale (the inverse of economies of scale).

A certain type of cost-conscious enterprise or consumer will put up with weaker products and paywalls; they want pure cost optimization. But the closed-source cost curve is still coming down radically, so it’s not even clear that open-source can be cheaper in the medium term. Dot-com companies used to spend half of their budget buying server racks, until AWS fixed the cloud capex problem; closed-source model providers do the same for AI.

In capitalist America, free is never really free, so you should wonder how you’ll ultimately be monetized. This isn’t Linux where a single developer built the product as a gift to humanity; these are cash-incinerating endeavors whose only way out is to eventually monetize you. You’re probably committing to a closed-source complement in time. Every open-source company rolls out a paid tier eventually; even Android eventually monetized via Google Play and search.

Even if self-hosted open-source models are marginally cheaper above a certain breakeven point, marginal cost optimization is the wrong focus at this stage in the cycle: for most applications, capabilities are holding back adoption, not price.
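To make the breakeven point concrete, here is a minimal sketch of the arithmetic, assuming self-hosting has a fixed monthly cost (GPUs, hosting, ops) plus a marginal cost per million tokens, while an API charges a flat price per million tokens. The function names and every dollar figure are hypothetical placeholders for illustration, not real vendor prices.

```python
import math


def monthly_api_cost(tokens_per_month: float, api_price_per_million: float) -> float:
    """Cost of serving the workload through a pay-per-token API."""
    return tokens_per_month / 1_000_000 * api_price_per_million


def monthly_self_host_cost(
    tokens_per_month: float,
    fixed_cost_per_month: float,
    marginal_cost_per_million: float,
) -> float:
    """Cost of self-hosting: fixed overhead plus a per-token marginal cost."""
    return fixed_cost_per_month + tokens_per_month / 1_000_000 * marginal_cost_per_million


def breakeven_tokens_per_month(
    fixed_cost_per_month: float,
    api_price_per_million: float,
    marginal_cost_per_million: float,
) -> float:
    """Monthly token volume above which self-hosting becomes cheaper.

    Returns infinity when the API is at least as cheap per token,
    because the fixed cost can then never be recovered.
    """
    saving_per_million = api_price_per_million - marginal_cost_per_million
    if saving_per_million <= 0:
        return math.inf
    return fixed_cost_per_month / saving_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs: $20,000/month fixed, $10 vs. $2 per million tokens.
    volume = breakeven_tokens_per_month(20_000, 10.0, 2.0)
    print(f"Breakeven volume: {volume:,.0f} tokens per month")
```

Under this simple model, every drop in the API price pushes the breakeven volume higher, and once the API price falls to the self-hosted marginal cost the breakeven disappears entirely, which is the medium-term risk described above.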

Model quality: As with housing, healthcare, and education, the paid version is generally better than the free version. Even within software, the open-source winner is rarely the best product: Android is worse than iOS, OpenOffice is worse than Office or Google Docs, Godot is worse than Unity, FreeCAD is worse than SolidWorks. A corollary is that engineers focused on the best platforms make more money; they’re more likely to be building cutting-edge products.

Everyone is celebrating that Llama-3 is on par with GPT-4, a year later. The product quality gap between iOS and Android, or macOS and consumer Linux, has stayed wide for a long time, because the best software creators are aligned with paying customers. When you choose closed-source models, you’re not making a point-in-time decision on model quality; you’re paying for future model improvements, where the roadmap is aligned with paying customers.

Most people focus on the last war (GPT-4) and not the next. So while open-source models are a healthy part of the ecosystem, they’re largely backwards looking. I don’t expect a capabilities plateau until the capex spend on GPUs and data reaches the tens of billions, on par with semiconductor manufacturing. Will the key open-source model builders find enough revenue to justify spending that much?

Data security: Some enterprises need the utmost data security: financial services, healthcare, legal. But I’m not sure using open-source models on-prem or via third-party cloud hosting is actually safer than using third-party LLMs in the cloud; this is a legacy belief from the early internet era, when an on-premise data center was the Fort Knox of data security.

As a customer, I’d trust Microsoft with healthcare data security more than my IT department’s self-managed data center. And that bridge has already been crossed: when 65% of the risk-averse Fortune 500 already uses Azure OpenAI, it makes you wonder who is dealing with data that is too sensitive for cloud-based LLMs.

Even if it makes eventual economic sense for model builders to build open-source, should they? Advocates like Yann LeCun claim that open-sourced AI is safer than closed. It makes me wonder if he really believes in Meta’s AI capabilities. Any reasonable extrapolation of capabilities with more compute, data, and autonomous tool use is self-evidently dangerous.

Appealing to American security may seem overwrought, but the past five years of geopolitics have confirmed that not everyone is on the same team. Every country outside America has an interest in undermining our closed-source model providers: Europe doesn’t want the US winning yet another big tech wave, China wants free model weights to train their own frontier models, and rogue states want to use unfiltered and untraceable AI to fuel their military and economic interests.

AI is a technology of hegemony. Even though open-source models are lagging behind the frontier, we shouldn’t export our technological secrets to the world for free. We’ve already recognized the national security risk in other parts of the supply chain via export bans in lithography and semiconductors. When open-source model builders release the model weights, they arm our military adversaries and economic competitors. If I were the CCP or a terrorist group, I’d be generously funding the open-source AI campaign.

A common retort is that the CCP has its own frontier AI models that are comparable to those of the US. But they’re still behind, and if these technologies follow a Moore’s Law pattern where capabilities compound, it’s increasingly difficult to get to the frontier if you’re not already there.

Looking at semiconductors as a historical analogy, nobody has been able to catch up to the frontier of semiconductors outside of TSMC + Nvidia, despite decades of trying. Russia is decades behind the semiconductor frontier, and even China’s SMIC is half a decade behind TSMC.

Language models are in their infancy, but they can already assist with cyberattack generation, bioweapons research, and bomb assembly. Yes, Google indexes dangerous information too, but the automation of LLMs is what makes them dangerous: a rogue model doesn’t simply explain a cyberattack concept like a Google result does, but can instead write the code, test it, and deploy it at scale – short-circuiting an otherwise arduous criminal activity.

A common open-source claim is that decentralized control of the model is better than trusting a central party. But that’s a luxury US belief where the perceived downside is limited to content moderation and corporate greed. This is a much more consequential technology: LLMs will be used to attack critical infrastructure in the West, fuel disinformation campaigns during critical elections, power cyber attacks during war time, and commit fraud. These bad actors are, and will be, empowered by open-source models.

New technologies have a long history of upsetting the balance of power in the world. As a technology with militaristic ramifications, AI should not be treated lightly.

There are varying degrees of open-source AI optimism: many simply argue that open-source is good for humanity, others claim that building open-source models is a good business strategy, and on the extreme end, some believe that leading model providers should be forced to open-source their technology. But all three positions are dubious: open-source models empower our adversaries, and will reach increasingly negative ROI for model builders and developers.

I admit I have an allergic reaction when many open-source advocates expose their socialist tendencies from Europe, academia, or both. While early AI undoubtedly benefited from research that was openly published before its commercial value was fully recognized, academia seems ill-suited to drive frontier research going forward. Stanford’s NLP lab has only 64 GPUs, and even Fei-Fei Li admits that academia is “falling off a cliff” relative to industry.

America’s tech success is subject to endless criticism from those who missed out, but we handily won the last tech wave because American capitalism aligns users and companies for the long term. The software frontier doesn’t move forward in the long run without relentless and sustained obsession from correctly incentivized companies. This is even more true in the context of capital intensive foundation models.

Open-source will have a home wherever smaller, less capable, and configurable models are needed – enterprise workloads, for example – but the bulk of the value creation and capture in AI will happen using frontier capabilities. The impulse to release open-source models makes sense as a free marketing strategy and a path to commoditize your complements. But open-source model providers will lose the capital expenditure war as open-source ROI continues to decline.

For companies on the verge of spending tens of billions without a clear business model, and the developers betting on that ecosystem, I wish them the best of luck. But the winning models of the next decade should and will be closed-source.

Note: I work at Founders Fund, which has invested in OpenAI and Scale.

Thanks to Divyahans Gupta, Axel Ericsson, Harry Elliott, Phil Clark, Cat Wu, Michael Solana, Joey Krug, Mohit Agarwal, Yash Patil, Koren Gilbai, Theodor Marcu, and Gustavs Zilgalvis for their thoughts and feedback on this article.

John Luttig