Hollywood’s AI Concerns Present New and Complex Challenges for Legal Eagles to Untangle

Technology has been disrupting Hollywood even longer than the L.A. district now synonymous with the entertainment business has existed. From Thomas Edison’s 1877 invention of the phonograph to the turn-of-the-20th-century introduction of radio, followed by the VCR and the internet, businesses and business models built to serve one generation of technology have often stumbled — and sometimes toppled — when confronted with the capabilities and consumer expectations created by another. 

Few new technologies, however, have been quite as foundation-shaking as generative AI. Apart from the head-snapping speed at which the technology has developed, it operates on such different principles from other information technologies as to challenge the very notion of creativity, and to defy traditional concepts of authorship, identity and intellectual property. 

It’s posing a challenge to courts and legislatures as they begin to grapple with its implications — and to legal eagles intent upon protecting Hollywood IP and guarding against so-called deepfakes created with tech tools. Worries about AI replacing talent became a major stumbling block during the rare dual strikes that crippled Hollywood production last year, and AI advances have escalated in the months since SAG-AFTRA, led by Fran Drescher, signed a deal with the AMPTP. 

“AI can be a tool we use, but it’s still us, the writers, who are doing that work,” John August, a screenwriter and member of the WGA negotiating committee, said at a public meeting hosted by the Copyright Office. “And it’s important, as we look at the impact of AI on copyright, not to confuse the copyright holder with the author, and that we are the human authors of the work.”

Here is a primer to key legal issues surrounding AI.

As of the end of March, at least 16 lawsuits had been filed in the U.S. over the unlicensed use of copyrighted works to train generative AI models. Others have been filed over deepfakes, unfair competition and various state law claims. At least one case has been filed over whether works generated by AI are eligible for copyright. 

More are certainly on the way.

Ten of the 16 copyright infringement cases were filed as putative class actions on behalf of authors or visual artists. Most allege multiple causes of action, from direct copyright infringement to contributory infringement, inducement and vicarious infringement. 

Two of the most high-profile cases have been brought by single rights owners: the New York Times against OpenAI and Microsoft, and Getty Images against Stability AI. 

As of this writing, few of those cases have progressed beyond the pleadings stage. And several have been pared down by judges, eliminating most of the secondary liability claims, leaving only claims of direct infringement to go forward. 

Even pared down, however, courts are likely to struggle to find precedents that clearly apply. 

Unlike earlier technologies that reproduce, distribute, perform or display copyrighted works, a generative AI system’s use and reuse of text, images and sounds are purely computational. 

Most generative AI systems today are built on artificial neural networks, composed of multiple layers of interconnected processing nodes modeled very loosely on the organization of the human brain. As data from a particular domain — text, music, video, images, code, etc. — is fed into the system, it gets broken down into its constituent parts, each of which is assigned a unique numeric value. 

Those numeric values then become the raw material for processing by individual nodes. Using those values and the mathematical relationships among them, the system gradually develops a highly complex mathematical model of natural language, imagery or sound. As more data is fed in, the system continually adjusts the model until its predictions closely match new input. Over time, and with enough input, the network “learns” how to generate new instances of the type of data it was trained on in response to prompts. 
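To make that process concrete, here is a minimal, hypothetical Python sketch of the tokenization and statistics-gathering steps described above. It is a toy stand-in, not how any production model actually works: real systems learn billions of neural-network weights rather than keeping simple counts, but the principle of reducing text to numeric values and capturing the relationships among them is the same.

    from collections import defaultdict

    # Toy corpus standing in for training data (real models ingest
    # hundreds of billions of words).
    corpus = "the cat sat on the mat the cat ate"

    # Step 1: break the text into constituent parts ("tokens") and
    # assign each unique token a numeric value.
    tokens = corpus.split()
    vocab = {word: i for i, word in enumerate(dict.fromkeys(tokens))}
    ids = [vocab[word] for word in tokens]
    print(ids)  # [0, 1, 2, 3, 0, 4, 0, 1, 5]

    # Step 2: record statistical relationships among those numeric
    # values -- here, just counts of which token follows which. A real
    # network encodes far richer relationships in its weights.
    follows = defaultdict(lambda: defaultdict(int))
    for a, b in zip(ids, ids[1:]):
        follows[a][b] += 1

    # "Generate" by picking the statistically likeliest successor
    # of a prompt token.
    id_to_word = {i: w for w, i in vocab.items()}
    next_id = max(follows[vocab["the"]], key=follows[vocab["the"]].get)
    print(id_to_word[next_id])  # 'cat'

Note what the toy model retains: numeric IDs and counts, not the text itself. That distinction sits at the heart of the technical and legal questions discussed below.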

 Fair or Unfair?

The amount of data used to train the largest models is staggering, on the order of tens of billions of images, or hundreds of billions of words. What the system actually retains from its training data, however, is not the words or images themselves but the numeric values assigned to their constituent parts and the statistical relationships among them.

Whether that process constitutes actual reproduction of the works in the training data, as plaintiffs have claimed, is as much a technical question as it is a legal one. 

In their defense, the AI companies targeted with copyright infringement suits have claimed fair use, arguing that the training process is transformative of the input and therefore not infringing under prevailing legal precedents. 

According to Pamela Samuelson, the Richard M. Sherman distinguished professor of law and information at UC Berkeley and a widely recognized expert on digital copyright, the biggest challenge plaintiffs in those cases face will be establishing actual — as opposed to speculative or potential — harm from the use of their works to train AI models, even if they can establish that their works were copied in the process of that training.

“Just making copies is not enough to say that there is harm,” she told an industry conference on AI in February. “There has to be some actual harm, or the likelihood of [actual] harm to an existing or likely to develop market” for the works. Promising that you were going to license the material isn’t enough either, she observed, as courts “need to see some evidence that the market was harmed.”

Samuelson rates the New York Times and Getty Images cases as the most likely to succeed or compel the defendants to settle because both companies had well-established licensing businesses in the content at issue that pre-date the rise of generative AI. 

On the legislative front, the European Union’s recently enacted AI Act, the most comprehensive effort by any jurisdiction to regulate AI, will require developers of large AI systems to provide “sufficiently detailed summaries” of the data used to train their models. 

That information might bolster rights owners’ infringement claims against AI companies by confirming that their works were used to train a particular model. That has yet to be tested in any European court, however. 

Rights owners in the U.S. have pressed for similar data transparency requirements. To date, neither Congress nor any regulatory agency has acted, although at least one bill has been introduced in the House to require disclosure of training data. 

Some in the industry have taken their own steps to promote greater training-data transparency. Ed Newton-Rex led the development of Stability AI’s music-generation model, Stable Audio, which was trained entirely on licensed content. But he resigned in November over the company’s broader embrace of fair use as a justification for training on unlicensed works. He has since launched Fairly Trained, a voluntary industry initiative to certify AI models trained without any unlicensed copyrighted works. 

At Stability, “I was very aware that there were two radically different approaches to training: One that involved scraping content and claiming it falls under the fair use exception in the U.S., and the other that doesn’t make that claim and is much more respectful of the owners of the content and the creators of the content,” he tells Variety. With Fairly Trained, “We want to make that division clear.” 

In March, the non-profit organization certified its first LLM, a legal-research tool developed by 273 Ventures that was trained entirely on court documents and legislative texts. 

 Faking It

One area where U.S. legislators have been active is the rapid proliferation of deepfakes. Video and audio deepfakes use generative AI to mimic the likeness and/or voice of an individual without their knowledge or consent, and can make it appear the person is saying, doing or singing something they’re not. 

Victims of deepfakes have included high-profile celebrities, as in the fake “Drake” and “The Weeknd” track “Heart on My Sleeve” released last April, and the explicit deepfake of Taylor Swift that went viral in January. Victims have also included many politicians, among them President Biden, which could explain lawmakers’ eagerness to act. 

Again, though, the legality or illegality of AI deepfakes is unclear. Roughly half the states have some form of right-of-publicity laws on their books, but the rules vary by state and many are addressed primarily to the use of someone’s likeness in advertising or other materials. They do not generally provide a clear cause of action for victims of AI deepfakes.  

In October, the chair and ranking member of the Senate Judiciary Committee’s intellectual property subcommittee, Chris Coons (D-Del.) and Thom Tillis (R-N.C.), circulated, but did not formally introduce, a draft of the NO FAKES Act. And in January, Rep. Maria Salazar (R-Fla.) introduced the No AI FRAUD Act in the House. Both bills would create a new class of federal intellectual property in an individual’s image, likeness and voice, and would supersede state publicity laws. 

Despite bipartisan concerns over the use of deepfakes to influence the coming election in November, neither bill is expected to receive a vote in the full chambers this year, given Congress’ intense focus on electoral matters. And in any case, establishing a new category of intellectual property alongside copyrights, patents and trademarks would be a heavy legislative lift, given the many stakeholders in the issue and the fact that courts and regulatory agencies are still sorting out whether existing laws already apply. 

In February, the Federal Trade Commission issued a Notice of Proposed Rulemaking (NPRM) seeking public comments on whether to extend its existing rule against impersonation of a business or government agency to cover individuals. Unlike the congressional bills, the FTC’s rulemaking would be grounded in laws governing unfair or deceptive trade practices, rather than in intellectual property, in line with the agency’s consumer-protection remit. 

In January 2023, meanwhile, the U.S. Copyright Office began a yearlong study on copyright and AI. The office received more than 10,000 written comments as part of its inquiry, reflecting the keen interest in the issue among the public as well as within the technology and creative industries. The office plans to issue a series of reports on its findings throughout 2024, including recommendations to Congress on any possible changes to copyright law. 

The first such report, slated for “late spring,” will address issues around deepfakes. Others will cover the use of copyrighted works in training AI models, and the copyrightability of works created using AI. 

 Made With AI

In June 2019, the computer scientist Stephen Thaler applied to the U.S. Copyright Office to register an image titled “A Recent Entrance to Paradise” that he claimed had been “autonomously created” by a generative AI model he had developed. The office rejected his application, citing its long-standing policy that copyright protection is reserved for works created by human authors. 

After filing a series of petitions for reconsideration, which were also rejected, Thaler sued the Copyright Office claiming its humans-only policy was an overly narrow interpretation of the Copyright Act. In August 2023 the district court ruled in favor of the Copyright Office, citing a long line of Supreme Court precedents upholding the humans-only policy. The case is now on appeal. 

The policy was also at the center of a controversy over a graphic novel titled “Zarya of the Dawn” written and illustrated by artist Kristina Kashtanova. The Copyright Office accepted her registration in the fall of 2022. In February 2023, however, after learning that the images in the novel were created using the Midjourney AI image generator, the office partially rescinded the registration. While the written text would remain protected, the AI-produced images would not be.

The case raised alarms among many artists and creators who use computer-aided tools in the normal course of their work and fear they might lose copyright protection as a result. It also sparked deep concern among the Hollywood studios, which fear the rapidly expanding use of AI in film and television production could compromise their ability to obtain copyright protection for their works. 

“MPA is troubled that the Office is moving toward an inflexible rule that will deny registration if human users are not able to predict and control the particular outputs that follow from prompts provided to the AI system, despite extensive human involvement in the creative process,” the Motion Picture Assn. wrote in comments submitted to the Copyright Office in its inquiry into copyright and AI. “Even if such an approach is appropriate for some uses of ‘generative AI’ systems like Midjourney, the approach should not apply to MPA’s members’ use of AI as a production and post-production tool.”

There is broad agreement among stakeholders that the Copyright Office’s current policy of parsing individual works to separate the purely human elements, the purely AI-generated elements and the AI-assisted human elements for disparate treatment under copyright is unworkable, both because the volume of mixed works will quickly overwhelm the system and because the lines will keep shifting. 

The office plans to issue updated registration guidance later this year. Until then, a cloud of uncertainty will hang over works created with the help of AI. 

Paul Sweeting is co-founder of the RightsTech Project and founder & principal of Concurrent Media Strategies.
