OpenAI deletes fine print on ‘military’ use of its AI technology

The change hints at a gradual weakening of the company’s stance against working with military organizations.

ChatGPT-maker OpenAI has altered the fine print in its usage policies, eliminating specific language that barred the use of its AI technology and large language models for “military and warfare.”

Before the change, made on January 10, the usage policy specifically disallowed the use of OpenAI models for weapons development, for military and warfare, and for content that promotes, encourages, or depicts acts of self-harm.

OpenAI said that the updated policies summarize the list and make the document more “readable” while offering “service-specific guidance”.

The list has now been condensed into what the company terms Universal Policies, which disallow anyone from using its services to harm others and ban the repurposing or distribution of any output from its models to harm others.

“Our policy does not allow our tools to be used to harm people, develop weapons, for communications surveillance, or to injure others or destroy property,” an OpenAI spokesperson said. “There are, however, national security use cases that align with our mission. For example, we are already working with DARPA to spur the creation of new cybersecurity tools to secure open source software that critical infrastructure and industry depend on. It was not clear whether these beneficial use cases would have been allowed under ‘military’ in our previous policies. So the goal with our policy update is to provide clarity and the ability to have these discussions.”

While the alteration of the policies is being read as a gradual weakening of the company’s stance against working with defense or military-related organizations, the “frontier risks” posed by AI have already been highlighted by several experts, including OpenAI CEO Sam Altman.

Highlighting risks posed by AI

In May last year, hundreds of tech industry leaders, academics, and other public figures signed an open letter warning that AI evolution could lead to an extinction event, saying that controlling the tech should be a top global priority.

“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war,” read the statement published by the San Francisco-based Center for AI Safety.

Ironically, the most prominent signatories at the top of the letter included Altman and Microsoft CTO Kevin Scott. Executives, engineers, and scientists from Google’s AI research lab, DeepMind, also signed the letter.

The first such letter came in March, when more than 1,100 technology luminaries, leaders, and scientists issued a warning against labs performing large-scale experiments with AI.

In October, OpenAI said it was assembling a team to guard against what the company calls frontier AI models triggering catastrophes such as nuclear war, among other threats.

“We believe that frontier AI models, which will exceed the capabilities currently present in the most advanced existing models, have the potential to benefit all of humanity. But they also pose increasingly severe risks,” the company said in a blog post.

In 2017, an international group of AI and robotics experts signed an open letter to the United Nations to halt the use of autonomous weapons that threaten a “third revolution in warfare.”

These experts, again ironically, included Elon Musk, who has since set up an AI firm, dubbed X.AI, to compete with OpenAI.

Reasons for concern

There could be reasons for more concern. Some researchers argue that so-called “evil” or “bad” AI models cannot be scaled back or trained to be “good” with existing techniques.

A research paper led by Anthropic, which examined whether an AI system can be taught deceptive behavior or strategies, showed that such behavior can be made persistent.

“We find that such backdoored behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it),” the researchers wrote.

“Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety,” they added.

According to the researchers, what is even more concerning is that using adversarial training to stop such deceptive behavior can instead teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior.

(The story has been updated with comments from an OpenAI spokesperson.)

Anirban Ghoshal is a senior writer, covering enterprise software for CIO and databases and cloud infrastructure for InfoWorld.

Copyright © 2024 IDG Communications, Inc.
