ChatGPT-maker braces for fight with New York Times and authors on 'fair use' of copyrighted works

FILE - The OpenAI logo is seen on a mobile phone in front of a computer screen displaying output from ChatGPT, March 21, 2023, in Boston. A barrage of high-profile lawsuits in a New York federal court, including one by the New York Times, will test the future of ChatGPT and other artificial intelligence products. (AP Photo/Michael Dwyer, File)

A barrage of high-profile lawsuits in a New York federal court will test the future of ChatGPT and other artificial intelligence products that wouldn't be so eloquent had they not ingested huge troves of copyrighted human works.

But are AI chatbots, in this case widely commercialized products made by OpenAI and its business partner Microsoft, breaking copyright and fair competition laws? Professional writers and media outlets will face a difficult fight to win that argument in court.

“I would like to be optimistic on behalf of the authors, but I’m not. I just think they have an uphill battle here,” said copyright attorney Ashima Aggarwal, who used to work for academic publishing giant John Wiley & Sons.

Three such cases are pending: one from The New York Times, another from a group of well-known novelists such as John Grisham, Jodi Picoult and George R.R. Martin, and a third from bestselling nonfiction writers, including an author of the Pulitzer Prize-winning biography on which a hit movie was based.

THE LAWSUITS

Each of the lawsuits makes different allegations, but they all center on the San Francisco-based company OpenAI “building this product on the back of other peoples’ intellectual property,” said attorney Justin Nelson, who is representing the nonfiction writers and whose law firm is also representing The Times.

“What OpenAI is saying is that they have a free ride to take anybody else’s intellectual property really since the dawn of time, as long as it’s been on the internet,” Nelson said.

The Times sued in December, arguing that ChatGPT and Microsoft's Copilot are competing with the same outlets they are trained on and diverting web traffic away from the newspaper and other copyright holders who depend on advertising revenue generated from their sites to keep producing their journalism. It also provided evidence of the chatbots spitting out Times articles word-for-word. At other times the chatbots falsely attributed misinformation to the paper in a way it said damaged its reputation.

One senior federal judge is so far presiding over all three cases, as well as a fourth from two more nonfiction authors who filed another lawsuit last week. U.S. District Judge Sidney H. Stein has been at the Manhattan-based court since 1995 when he was nominated by then-President Bill Clinton.

THE RESPONSE

OpenAI and Microsoft haven't yet filed formal counter-arguments on the New York cases, but OpenAI made a public statement this week describing The Times lawsuit as “without merit” and saying that the chatbot's ability to regurgitate some articles verbatim was a “rare bug.”

“Training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents,” said a Monday blog post from the company. It went on to suggest that The Times “either instructed the model to regurgitate or cherry-picked their examples from many attempts.”

OpenAI cited licensing agreements made last year with The Associated Press, the German media company Axel Springer and other organizations as offering a glimpse into how the company is trying to support a healthy news ecosystem. OpenAI is paying an undisclosed fee to license AP's archive of news stories. The New York Times was engaged in similar talks before deciding to sue.

OpenAI said earlier this year that access to AP's “high-quality, factual text archive” would improve the capabilities of its AI systems. But its blog post this week downplayed the importance of news content for AI training, arguing that large language models learn from an “enormous aggregate of human knowledge” and that “any single data source, including The New York Times, is not significant for the model’s intended learning.”

WHO'S GOING TO WIN?

Much of the AI industry's argument rests on the “fair use” doctrine, which allows for limited uses of copyrighted materials such as for teaching, research or transforming the copyrighted work into something different.

In response, the legal team representing The Times wrote Tuesday that what OpenAI and Microsoft are doing is “not fair use by any measure” because they're taking from the newspaper's investment in its journalism “to build substitutive products without permission or payment.”

So far, courts have largely sided with tech companies in interpreting how copyright laws should treat AI systems. In a defeat for visual artists, a federal judge in San Francisco last year dismissed much of the first big lawsuit against AI image-generators, though artists have since amended their complaint. Another California judge shot down part of comedian Sarah Silverman's lawsuit against Facebook parent Meta, but her case was amended in December and joined with another one that includes writers Ta-Nehisi Coates and Michael Chabon.

The most recent lawsuits have brought more detailed evidence of alleged harms, but Aggarwal said when it comes to using copyrighted content to train AI systems that deliver a “small portion of that to users, the courts just don’t seem inclined to find that to be copyright infringement.”

Tech companies cite as precedent Google’s success in fending off legal challenges to its online book library. The U.S. Supreme Court in 2016 let stand lower court rulings that rejected authors’ claim that Google’s digitizing of millions of books and showing snippets of them to the public amounted to copyright infringement.

But judges interpret fair use arguments on a case-by-case basis and it is “actually very fact-dependent,” depending on economic impact and other factors, said Cathy Wolfe, an executive at the Dutch firm Wolters Kluwer who also sits on the board of the Copyright Clearance Center, which helps negotiate print and digital media licenses in the U.S.

“Just because something is free on the internet, on a website, doesn't mean you can copy it and email it, let alone use it to conduct commercial business,” Wolfe said. “Who’s going to win, I don’t know, but I’m certainly a proponent for protecting copyright for all of us. It drives innovation.”

BEYOND THE COURTS

Some media outlets and other content creators are looking beyond the courts and calling for lawmakers or the U.S. Copyright Office to strengthen copyright protections for the AI era. A panel of the U.S. Senate Judiciary Committee heard testimony Wednesday from media executives and advocates in a hearing dedicated to AI's effect on journalism.

Roger Lynch, chief executive of the Conde Nast magazine chain, planned to tell senators that generative AI companies “are using our stolen intellectual property to build tools of replacement.”

“We believe that a legislative fix can be simple: clarifying that the use of copyrighted content in conjunction with commercial Gen AI is not fair use and requires a license,” says a copy of Lynch's prepared remarks.

___

This story was first published on January 9, 2024. It was updated on January 10, 2024 to make clear that a lawsuit brought by artists against AI image-generators and another lawsuit against Meta brought by authors, including Sarah Silverman, have been amended after judges dismissed parts of each case.

