Monday, May 13, 2024
Courthouse News Service

GitHub mounts second attempt to toss anonymous code writers’ claims of software piracy

A group of anonymous code writers is attempting to advance its remaining open source license breach claims against companies connected to GitHub's CoPilot, which was trained on public repositories of code scraped from the web.

OAKLAND, Calif. (CN) — GitHub and other companies are attempting for the second time this year to throw out claims from anonymous code writers that their coding assistant violated open source licenses protecting their work. 

The software development company, which allows developers to store and manage their code, argued Thursday before a federal judge in a case that questions whether the AI-powered coding assistant GitHub CoPilot relies on “software piracy on an unprecedented scale.” That claim comes from the code writers’ class action, filed in 2022 against Microsoft, GitHub and OpenAI.

CoPilot, which GitHub unveiled in 2021, is trained on public repositories of code scraped from the web. It uses artificial intelligence to absorb all of the code in GitHub, some of which was published under licenses requiring anyone who reuses the code to credit its creators. Because the tool sometimes produces strings of licensed code without providing credit, the plaintiffs want penalties in excess of $9 billion for any alleged infringement.

An example of how GitHub's CoPilot software suggests code to its users (CoPilot website via Courthouse News)

The defendants say in another motion to dismiss filed in June that the claims fail to prove injury and a viable claim for damages. They say that CoPilot helps developers write code by generating suggestions based on what it has learned from the entire body of knowledge gleaned from public code.

Attorney Joe Gratz, representing OpenAI, told U.S. District Judge Jon Tigar in a hearing Thursday that the plaintiffs failed to plausibly show standing to seek damages in their amended complaint. He said the plaintiffs’ code being used by CoPilot actually supports the defendants’ arguments, because it is "not plausible" that a company would go to great lengths to steal it.

“It is a rare case where plaintiffs adding more claims made their ability to show standing worse. But we think that’s what happened here,” he said. 

Attorney Annette Hurst, representing Microsoft and GitHub, said the plaintiffs’ claim that CoPilot is a derivative tool fails because they allege that the tool was trained on their code, when the tool is “not capable of regurgitating verbatim” independently designed code.

“The Copyright Act is a delicate balance. It’s also about how those rights are limited,” Hurst said. She said the legal standard of preemption sweeps more broadly than the protections under the federal act, in order to preserve the act’s limitations and prevent plaintiffs from using state law to evade them.

“Code has a thin copyright,” Hurst added. “The idea vs. expression dichotomy is going to be very significant in this case, as is fair use.” 

Attorney Matthew Butterick, representing the plaintiffs, said the case is about open source licenses, not copyright infringement. CoPilot can in fact reproduce “nearly verbatim” code streams, he said. 

“If you are a human programmer using this code which is subject to an open source license, the moment you do so you incur the obligations of the license. You must provide attribution and a copy of the license,” Butterick said — adding that CoPilot should follow the same rules. 

Another attorney for the plaintiffs, Joseph Saveri, said that state tort laws around unauthorized use are not preempted by the Copyright Act if they include an additional element. He said that the plaintiffs’ unauthorized use claims meet this requirement.

The judge asked if the claims at their core rest on the plaintiffs having an exclusive right to control the distribution of certain code to the public. Saveri said the claims cite sufficient rulings to show precedent that property rights created by open source licenses are actually different.

Gratz said in rebuttal that the plaintiffs’ claims that part of their code has been used verbatim are not plausible because they did not prove that their code was ever “spit out” by CoPilot.

“Unsurprisingly, you’re framing the question in the way that’s helpful to you,” Tigar said. “That’s not the only way of looking at it.” 

Hurst said the question is not whether plaintiffs’ code was used, but whether training CoPilot with the code would cause plaintiffs economic injury and damages.

Tigar did not indicate which way he may rule, or when he will issue an order.

“I think you have an uphill battle on preemption,” he told the plaintiffs. 

The parties return to court Dec. 20.

In May, the judge tossed many claims against GitHub and the other companies, including dismissing with prejudice claims of civil conspiracy and declaratory relief. However, Tigar advanced the plaintiffs’ claims for breach of license and injury to property rights. He also allowed them to take another crack at their copyright preemption claim because their state law claims are qualitatively different from claims under the Copyright Act.

Follow @nhanson_reports
Categories / Business, Law, Securities, Technology
