Reference giants launch copyright fight against OpenAI

MANHATTAN (CN) — Encyclopedia Britannica and its Merriam-Webster subsidiary sued OpenAI in New York City federal court on Friday, accusing the Microsoft-backed artificial intelligence pioneer of illegally copying nearly 100,000 online articles to train ChatGPT.

In a 44-page civil complaint filed late Friday evening in the Southern District of New York, Encyclopedia Britannica and Merriam-Webster accused OpenAI of violating their copyrights by scraping online content at massive scale to train the large language models that power the popular AI chatbot.

The English encyclopedia company, which dates to the late 18th century, claims ChatGPT-generated responses to users’ queries include verbatim or near-verbatim reproductions of its copyrighted articles and digital content.

“Defendants’ ChatGPT-based AI products free ride on plaintiffs’ trusted, high quality content — made possible through the diligent work of human researchers, writers, editors and creators — by cannibalizing traffic to defendants’ websites with AI-generated summaries of plaintiffs’ own content,” Encyclopedia Britannica wrote in the five-count civil complaint, which features four counts of copyright infringement and one count of trademark dilution.

Encyclopedia Britannica claims it reached out to OpenAI to discuss potential licensing opportunities, including an initial discussion in November 2024 that went nowhere.

“After that discussion, an OpenAI representative rebuffed plaintiffs’ licensing outreach, and OpenAI never seriously pursued licensing plaintiffs’ content,” Encyclopedia Britannica wrote in the complaint. “Instead, despite entering into licensing deals with other similar publishers, defendants continued to copy plaintiffs’ content without compensating plaintiffs.”

An OpenAI spokesperson on Monday said ChatGPT’s language models “are trained on publicly available data and grounded in fair use.”

“ChatGPT helps enhance human creativity, advance scientific discovery and medical research, and enable hundreds of millions of people to improve their daily lives,” a company spokesperson told Courthouse News.

Encyclopedia Britannica rebuts the fair use defense, asserting in the complaint that OpenAI’s “misuse of plaintiffs’ copyrighted works is also not transformative.”

“A transformative work adds something new, with a further purpose or different character, altering the original with new expression, meaning, or message,” the reference giants wrote. “Instead, ChatGPT copies the expression, meaning and message of copyrighted content, including that of plaintiffs, and repackages it to the consumer. ChatGPT adds no new expression, meaning or message of their own.”

OpenAI was founded as a nonprofit in 2015, and its nonprofit board has continued to control the for-profit subsidiary that now develops and sells AI products.

Last month, OpenAI announced its $840 billion post-money valuation after a record-breaking $110 billion funding round.

In addition to the copyright infringement counts, Encyclopedia Britannica and Merriam-Webster accuse the ChatGPT maker of engaging in trademark violations by falsely attributing so-called “hallucinations” of factually incorrect statements to them, and by misleadingly omitting content while reproducing their works.

Encyclopedia Britannica sued OpenAI’s competitor Perplexity last September in Manhattan federal court, similarly accusing the San Francisco-based AI startup of scraping and plagiarizing hundreds of thousands of copyrighted online articles to feed Perplexity’s “answer engine” chatbot.
