Misconceptions about SB 1047

This February, California state senator Scott Wiener (D – San Francisco) introduced Senate Bill 1047, which, if passed, would require the companies behind the world’s largest, most advanced AI models to take steps to guarantee their safety before releasing the models to the public. For a piece of state legislation, the bill has unusual international importance. Most of the companies making such models are headquartered in California. 

Despite write-ups in the San Francisco Chronicle and Washington Post, the bill initially received relatively little attention. That was until last week, when — as these things go — a spate of strong objections to it exploded across Twitter and Substack.

Some of these objections were helpful critiques pointing to potential problems with SB 1047’s current form. Some were based on a failure to understand how the law works, a failure to carefully read the bill, or both. And some were alarmist rhetoric with little tether to the bill as it’s actually written. 

Nothing is a substitute for reading the actual bill. The purpose of this piece, a condensed version of the one I published on my blog, is to respond to some of the most serious misconceptions, and to suggest concrete changes that address some of the real concerns.

So what does the bill actually do?

SB 1047 only applies to the most powerful new AI models, which it defines as “covered models.” Let’s say you’d like to build a new model. What counts?

Covered models are those trained using more than 10²⁶ flops (a quantity of computation that, at current prices, is estimated to cost between tens and hundreds of millions of dollars), or projected to have similar performance on benchmarks used to evaluate state of the art foundation models. If your model was trained using fewer than 10²⁶ flops and doesn’t outperform those that were? It is not a covered model. You do not need to do anything at all. Note that this 10²⁶ threshold would, according to our best estimates, exclude every released model, including GPT-4, Claude Opus, and the current versions of Google Gemini. None of them would count as covered models under this bill.
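
As a rough illustration of where that cost range comes from, here is a back-of-envelope sketch. The hardware throughput, utilization, and price figures are assumptions of mine, not numbers from the bill.

```python
# Back-of-envelope cost estimate for a 1e26-flop training run.
# All hardware and price figures below are illustrative assumptions,
# not values taken from SB 1047.

TOTAL_FLOPS = 1e26                 # the bill's compute threshold
PEAK_FLOPS_PER_GPU = 1e15          # ~1 petaflop/s per accelerator (assumed)
UTILIZATION = 0.4                  # assumed fraction of peak actually achieved
PRICE_PER_GPU_HOUR = 2.50          # assumed dollars per accelerator-hour

effective_flops_per_hour = PEAK_FLOPS_PER_GPU * UTILIZATION * 3600
gpu_hours = TOTAL_FLOPS / effective_flops_per_hour
cost = gpu_hours * PRICE_PER_GPU_HOUR

print(f"{gpu_hours:,.0f} accelerator-hours, roughly ${cost / 1e6:,.0f} million")
# With these assumptions: about 69 million accelerator-hours and $174 million,
# consistent with "tens to hundreds of millions of dollars."
```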

What if your model is a derivative of an existing model, e.g. of an open-source model? You also do not need to do anything. No current open-source model is anywhere near the threshold, other than Meta’s prospective Llama-3 400B, which may or may not hit it. But even if Llama-3 400B is covered, developers who use or modify it would be unaffected. All the requirements in the bill fall on the original developer: in this case, Meta.

Now let’s say you’ve secured access to enough computing power to meet the compute threshold, or you can be reasonably sure your model will be 2024 state of the art. Then you are training a covered model, and you will need to adhere to the safety requirements laid out in the bill.

During training, you will need to a) ensure no one can hack it (while it remains in your possession), b) make sure it can be shut down, c) implement covered guidance (here meaning guidance issued by the National Institute of Standards and Technology and by the newly created Frontier Model Division, as well as “industry best practices”), and d) implement a written and separate safety and security protocol which can provide reasonable assurance that the model lacks hazardous capabilities, or — if it has them — that it can’t use them. You will also need to include the testing procedure you will use to identify the hazardous capabilities — and say what you will do if you find them. Notably, the bill doesn’t specify what any of this looks like. Developers create and implement the plans; the government does not dictate what they are.

Hazardous capabilities are set at an extremely high threshold. We are not talking about hallucinations, bias, or Gemini generating images of diverse senators from the 1800s — or even phishing attacks, scams, or other serious felonies. The bill specifies hazardous capabilities as the ability to directly enable a) the creation or use of weapons of mass destruction; b) at least $500 million of damage through cyberattacks on critical infrastructure via a single incident or multiple related incidents; c) the same amount of damage, performed autonomously, in conduct that would violate the Penal Code; or d) other threats to public safety of comparable severity. 

You train your covered model and now you want to release it. What then? You must implement “reasonable safeguards and requirements” to prevent anyone from being able to use its hazardous capabilities, if it has them. After deployment, you need to periodically reevaluate your safety protocols, and file an annual report that says you’re doing so. 

If you don’t want to comply with these requirements, you can apply for a limited duty exemption. 

You have two ways to do this. One is for you, the developer, to provide reasonable assurance that your model (and any derivative models based on it) won’t have hazardous capabilities. Or you can show that your model will be no more capable than existing noncovered models which themselves lack hazardous capabilities, or which have their own limited duty exemption. Meeting either requirement allows you to train and release your model without additional safeguards. 

The point of the limited duty exemption is to reduce the regulatory burden on future, more capable models: in the coming years, we expect to learn more about which classes of models are safe. The limited duty exemption would remove most safety requirements for developers replicating models that are already proven safe. Let’s say it’s clear in 2025 that state of the art covered models from 2024 don’t have hazardous capabilities — these models, and others based on them, could get an exemption and be released without safeguards. 

Does anyone need to get a limited duty exemption? No. Labs are entirely free to deploy models that don’t qualify for an exemption, as long as they still comply with safety requirements. 
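
To pull the threshold, derivative-model, and exemption rules together, here is a deliberately simplified, unofficial sketch of the decision flow a developer might walk through. The names and structure are mine, not the bill’s, and many details are omitted.

```python
# Unofficial, simplified sketch of SB 1047's applicability logic as described
# above. Names and fields are illustrative only; consult the bill's text.

from dataclasses import dataclass

FLOP_THRESHOLD = 1e26  # the bill's training-compute threshold

@dataclass
class Model:
    training_flops: float          # total compute used to train from scratch
    is_derivative: bool            # built on someone else's foundation model
    matches_2024_frontier: bool    # comparable performance on frontier benchmarks
    qualifies_for_exemption: bool  # reasonable assurance of no hazardous capabilities

def obligations(model: Model) -> str:
    """Very rough summary of who owes what under the bill, as described above."""
    if model.is_derivative:
        return "none: obligations stay with the original developer"
    if model.training_flops < FLOP_THRESHOLD and not model.matches_2024_frontier:
        return "none: not a covered model"
    if model.qualifies_for_exemption:
        return "limited duty exemption: minimal requirements"
    return "covered model: safety protocol, shutdown capability, testing, annual reports"

# Example: a modest model trained from scratch, well below the threshold.
print(obligations(Model(1e24, False, False, False)))  # none: not a covered model
```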

Major misconceptions

Is this an existential threat to California’s AI industry? 

No. The bill has zero or minimal impact on most of California’s AI industry, and this is unlikely to change for years. Few companies will want to train covered models that attempt to compete with Google, Anthropic and OpenAI. There’s no requirement for developers of noncovered models to do anything at all — even fill out paperwork. The bill relies entirely on AI labs to self-report if their models are large enough to qualify. For those who do train covered models, there will be increasing ability to get limited duty exemptions that make the requirements trivial.

Even for leading labs, the compliance costs imposed by the bill are trivial compared to the compute costs of training the model. As a practical matter, I believe that I could give reasonable assurance, right now, that all publicly available models (including GPT-4, Claude 3, and Gemini Advanced 1.0 and Pro 1.5) lack hazardous capabilities. 

The people who made them think so too. While the safety requirements in the bill are more stringent, and their final details have yet to be determined, they’re not drastically different from what industry leaders already do. When GPT-4 was released, OpenAI tested its ability to contribute to the creation of biological, nuclear, and chemical weapons, and its ability to carry out cyberattacks. They also partnered with METR (Model Evaluation and Threat Research), a nonprofit team at the forefront of assessing whether AI systems pose catastrophic risks, to test if the model was capable of “escaping” its datacenter and copying itself into the wild (it wasn’t). Anthropic regularly tests their models for biological and cyber capabilities as well as autonomous replication. Both labs have published frameworks, Anthropic’s Responsible Scaling Policy and OpenAI’s Preparedness Framework, that detail their plans for testing their models and the level of risk that would make them stop.

These measures are already the most rigorous in the business, and they clearly haven’t destroyed either lab’s ability to make leading-edge models.

Does the bill create a new regulatory agency? 

No. It creates the Frontier Model Division within the California Department of Technology. The new division will issue guidance, allow coordination on safety procedures, appoint an advisory committee on (and to assist) open source, publish incident reports, and process certifications. That’s all. 

Some critics of SB 1047 have greatly exaggerated the powers of the FMD. In an impact analysis, Brian Chau, executive director of the advocacy group Alliance for the Future, writes that the bill “concentrates power (even military power) in a small, minimally accountable Frontier Model Division” and that the FMD would be “a new regulatory entity in California.” A call to action circulated by AFTF states that it would have “police powers,” and would be able to “throw model developers in jail for the thoughtcrime of doing AI research.” 

All of this is — to put it mildly —  false. SB 1047 will be enforced by the state Attorney General, not the FMD. The FMD has no power to enforce regulations, order arrests, or block the release of new models. And neither the Attorney General nor anyone else can throw people in jail for violating the bill (unless they intentionally lie to the government in writing), let alone wield military power. 


Are the burdens here overly onerous to small developers, researchers, or academics? 

Right now, rather obviously not, since the burdens do not apply to them. The substantial burdens apply only if you train a covered model, from scratch, that can’t get a limited duty exemption. A derivative model never counts. A ChatGPT wrapper definitely does not count. Research on any existing model does not count. 

This will not apply to a small developer for years. At the point that it does, yes, if you make a GPT-5-level model from scratch, I think you can owe us some reports.

Which models are covered depends not just on size but on performance — models with similar capabilities to what one could have trained with 10²⁶ flops in 2024 would also need to comply with safety regulations. Neil Chilson, Head of AI Policy at the newly formed Abundance Institute, argues that this clause is anti-competitive, with its purpose being to ensure that if someone creates a smaller model that has similar performance to the big boys, it would not have cheaper compliance costs.

No. The point is to ensure the safety of models with advanced capabilities. The reason to use a 10²⁶ flops threshold is that this is the best approximation we have for “likely to have sufficiently advanced capabilities.”

Are regulatory requirements capable of contributing to moats? Yes, of course. And it is possible this will happen here to a non-trivial degree, among those training frontier foundation models in particular. But — as I say above —  I expect the costs involved to be a small fraction of the compute costs of training such models, or the cost of actual necessary safety checks. 

Finally, does the bill, as claimed by (again) Brian Chau, “literally specify that they want to regulate models capable of competing with OpenAI”?

No, of course not. To the people spreading this claim: You can do better. You need to do better.

There are legitimate reasons one could think this bill would be a net negative even if its particular implementation issues are fixed. There are also particular details that need (or at least would benefit from) fixing. Healthy debate is good.

This kind of hyperbole, and a willingness to repeatedly signal boost it, is not.


Does SB 1047 target open source AI? 

Dean Ball, a research fellow at George Mason University’s Mercatus Center, claims that the bill would “effectively outlaw all new open source AI models.” 

It won’t. No existing open source model would count as a covered model. Maybe some will in the future. But this very clearly does not “ban all open source.” There are zero existing open weights models that this bans. There are a handful of open source developers that might plausibly have obligations under this bill in the future, but we’re talking about companies like Meta and perhaps Mistral, not small start-ups. 

Another set of concerns involves the bill’s shutdown requirement: developers of covered models must be able to shut down all copies of the model on computers that they control. This provision has been misread as referring to all copies in existence, which would indeed make releasing a model’s weights illegal, since it’s impossible for a company to shut down every copy of a model downloaded by anyone anywhere. But that is not what the law says. The relevant definition reads: 

22602 (m): “Full shutdown” means the cessation of operation of a covered model, including all copies and derivative models, on all computers and storage devices within custody, control, or possession of a person, including any computer or storage device remotely provided by agreement.

If they had meant “full shutdown” to mean “no copies of the model are now running” then this would not be talking about custody, control or possession at all. Instead, if the model has open weights and has been downloaded by others, the developer is off the hook.

Rather than a clause that is impossible for an open model to meet, this is a clause where open models are granted extremely important special treatment, in a way that seems damaging to the core needs of the bill.

Are developers guilty of perjury for being wrong? 

Other critics have claimed that the bill would jail developers for making good-faith mistakes about the capabilities of their models. Not unless they are willfully defying the rules and outright lying, in writing, to the government.

Even then, mostly no. It is extremely unlikely that perjury charges would be pursued absent clear bad faith and outright lying, and even then prosecution would be a long shot: California’s perjury statute requires you to know your statement to be false, and there were only a handful of prosecutions in all of 2022. People almost never get prosecuted for perjury. 

What are the real problems?

None of this is to say that SB 1047 is perfect. There are two known implementation problems with the bill as written. 

The most serious is that the bill’s definition of “derivative models” is too broad. As written, derivative models can include unlimited additional training. This creates a loophole: it’s possible to take a less capable base model and make it significantly more capable without being subject to any safety requirements. In that situation, responsibility would still rest with the original developer of the base model. 

This could be fixed by adding a section to make clear that if a sufficiently large amount of compute (I suggest 25% of original training compute or 10²⁶ flops, whichever is lower) is spent on additional training and fine-tuning of an existing model, then the resulting model is no longer derivative. The new developer takes on all the responsibilities of a covered model’s developer, and the old developer is no longer responsible.
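
To make the suggestion concrete, here is a minimal sketch of the proposed test. The function simply restates the 25 percent and 10²⁶ flop figures above; it is illustrative, not bill text.

```python
# Sketch of the suggested fix: a fine-tuned model stops being "derivative"
# once enough additional compute has been spent on it. Illustrative only.

ABSOLUTE_CAP_FLOPS = 1e26  # the bill's covered-model compute threshold

def remains_derivative(original_training_flops: float,
                       additional_training_flops: float) -> bool:
    """True if responsibility stays with the original developer under the
    proposed rule: additional compute below 25% of the original training
    compute or 1e26 flops, whichever is lower."""
    threshold = min(0.25 * original_training_flops, ABSOLUTE_CAP_FLOPS)
    return additional_training_flops < threshold

# Example: fine-tuning with 30% of the original compute would exceed the
# threshold, making the new developer responsible for a nonderivative model.
print(remains_derivative(1e26, 0.3e26))  # False
```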

The other problem is that, in determining if a model empowered a human to cause catastrophic harm, the bill compares an AI-assisted human to someone who lacks access to any covered models at all. But this isn’t realistic: over time, access to covered models will become more widespread and may make essentially every complex task substantially easier. This can be fixed by changing the baseline for comparison to what a human can do with access to covered models established as safe. 

Conclusion

Hopefully this has cleared up a lot of misconceptions about SB 1047.

California AI companies are investing billions of dollars in AI, and talking about making that trillions. Policymakers have been caught repeatedly off guard by the capabilities of the models they develop. SB 1047 is an admirable effort to get ahead of the ball, and make sure companies that spend tens or hundreds of millions of dollars on a new model are checking if their models can commit catastrophic large-scale crimes.

This bill would have zero impact on every model currently available outside the three big labs of Anthropic, OpenAI, and Google DeepMind; the only other model known to be in training that might be affected is Llama-3 400B. If you build a “derivative” model, meaning you are working off of someone else’s foundation model, you have to do nothing.

In addition, if in the future you build something substantially above GPT-4 level, matching the best anyone will do in 2024, then so long as you remain behind the then-existing state of the art, your requirements will be minimal.

This bill is not all upside, nor is it in ideal condition. But with some changes, it seems to be a mostly excellent version of what it is attempting to be. 

That does not mean it could not be improved further. It certainly does not mean we will not want to make changes over time as AI rapidly evolves, nor that it would be sufficient if passed in identical form at the federal level. For all the misguided talk of how this bill would supposedly destroy the entire AI industry in California, it is easy to see ways the bill could prove inadequate to future safety needs. What this does seem to be is a good baseline from which to encourage and gain visibility on basic safety precautions, which puts us in a better position to assess future unpredictable situations.

There is room — and we should make room — for good faith disagreements over SB 1047 and what effect it will have in practice. But those disagreements should be based on what the bill actually says and does. 
