MANHATTAN (CN) — During a court hearing Tuesday morning, lawyers for tech giants Microsoft and OpenAI defended the tech companies’ practice of “scraping” enormous quantities of online news stories to train the large language models used by their chatbots, and urged a federal judge to dismiss various copyright infringement claims by The New York Times, The New York Daily News and other news organizations.
In response to the publishers’ claims that OpenAI and Microsoft’s generative AI systems illegally copied the content from their newspapers and regurgitated that content back in response to chatbot users’ prompts, the companies’ lawyers argued that certain copyright claims for infringement based on OpenAI’s creation and use of training datasets for GPT-2 and GPT-3 should be tossed out as time-barred, under the statute of limitations for copyright law.
During oral arguments on Tuesday, Latham & Watkins attorney Andrew Gass said The New York Times waited more than three years to bring its copyright claims after OpenAI published papers disclosing that the newspaper’s content was included in the data used to train its models.
Gass said the Times’ suit came more than three years after the newspaper published an article about OpenAI’s groundbreaking generative AI technology.
While OpenAI insists that the enormous datasets, including news stories, used to train its artificial intelligence bots are protected by the fair use doctrine, the hearing on Tuesday concerned ancillary claims aside from the core issue of whether using copyrighted content to train a generative AI model is fair use under copyright law.
OpenAI and Microsoft filed motions to dismiss the complaints, arguing multiple defenses, including that the publishers fail to state a claim that the AI developers contributed to “end-user copyright infringement” or encouraged users to prompt the GPT-based products to produce content similar to the publishers’ articles.
OpenAI attorney Joseph Gratz said ChatGPT has guardrails to prevent the wholesale copying of news articles in response to users’ prompts. “That’s not what it’s for, that’s not what it’s designed to do, and that’s not what it does do,” he said, affirming that the newspapers have to deliberately coax the chatbot to produce such infringing responses to chat prompts.
New York Times attorney Ian Crosby warned that the predicted harm of generative AI to news publishers is “dire,” with 30% to 50% of online news traffic expected to be diverted and not return to the original sources of news stories.
U.S. District Judge Sidney Stein did not immediately rule on the motions to dismiss at the conclusion of the 2 1/2-hour in-person hearing.
Stein, a Bill Clinton appointee, noted during the hearing that he concurred that core claims involving fair use “ultimately would be the issue” in the case.
The 79-year-old judge spoke with familiarity about the underlying technology, and asked hypothetical questions about the chatbots responding to prompts asking for recent New York Times articles on current news items like the Southern California wildfires and the Senate confirmation hearing of President-elect Donald Trump’s pick to run the Department of Defense, Pete Hegseth.
The New York Times first brought a federal complaint in December 2023 against OpenAI and Microsoft seeking to end the practice of using its stories to train their respective chatbots, ChatGPT and Microsoft Copilot, formerly known as Bing Chat.
A coalition of eight newspapers owned by the MediaNews Group and Tribune Publishing companies separately sued OpenAI and Microsoft four months later, claiming the companies infringed the publishers’ copyrights on a large scale by using their articles without permission or payment to fuel the commercialization of AI products including ChatGPT and Copilot.
The publishers claim that collectively, content from their websites accounts for at least 124 million basic pieces of text included in the Common Crawl depository of data constantly dredged from the open internet and used to train the software’s large language models.
The Center for Investigative Reporting (CIR), the United States’ oldest nonprofit newsroom, also sued the tech companies in June 2024.
In a court filing, OpenAI derided the Daily News-led lawsuit as a “copycat” complaint modeled after the Times’ case.
The ChatGPT maker further argues that those regional and local newspapers’ trademark claims fail because their trademarks “are not sufficiently famous to support a claim for dilution.”
Prior to deciding to sue, The New York Times had engaged in talks with OpenAI about negotiating a fee to license the newspaper’s archive to the artificial intelligence company.
Last fall, Microsoft reported its quarterly sales grew 16% to $65.6 billion as the company sought to assure investors its enormous spending on artificial intelligence is paying off.
The company has invested billions of dollars to expand its global network of data centers and other physical infrastructure required to develop AI technology that can compose documents, make images and serve as a lifelike personal assistant at work or home.