Intro
OpenAI, the company behind the increasingly popular AI chatbot ChatGPT, and Meta are facing legal trouble. Several authors are suing the companies on the grounds that their copyrighted novels were used to train the companies' AI systems without the authors' consent. The lawsuit has become a class action and includes authors Sarah Silverman, Richard Kadrey, and Christopher Golden. A remarkably similar suit alleging copyright infringement against OpenAI was filed several weeks prior by authors Mona Awad and Paul Tremblay.
Main Issues
The question at the core of the case is whether this method of AI training constitutes fair use of copyrighted works. Section 107 of the Copyright Act provides guidance on what constitutes fair use: (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.[1] The fair use doctrine operates as an exception to copyright infringement. This lawsuit specifically calls into question the legality of how the training data was acquired. ChatGPT is designed to ingest substantial amounts of text during training, drawn from some legal libraries and some illegal shadow libraries. Shadow libraries are online databases that allow copyrighted works, or other normally inaccessible works, to be accessed for free – which is illegal because they lack the rights to distribute those works. It was confirmed that the book data includes “all of Bibliotik,” a known shadow library containing 196,640 books. The plaintiffs presented evidence that the authors’ works were used, in the form of copied conversations with ChatGPT in which it accurately summarized the works of Silverman, Kadrey, and Golden. The lawsuit accuses Meta and OpenAI of knowingly feeding their models this information despite its illegal source. The plaintiffs call this use of shadow libraries “flagrantly illegal,” and the authors are seeking compensation from the AI companies for their stolen works.
What Is To Come
This new generation of AI technology has caused a lot of fear and uncertainty about exactly what its limits are. There is widespread fear about what AI means for the future of creative careers, including journalism, writing, filmmaking, TV writing, and other similar paths. AI, arguably, can now perform these tasks at a level approaching what humans can. But these systems rely on a considerable amount of information, and if that information was not legally acquired, that could be a big game changer. AI outputs may also infringe copyrights in other works if the program: 1) had access to those works; and 2) produced “substantially similar” outputs. These cases indicate a pattern of authors taking action to protect their copyrighted works against AI programs, but will they be successful? And how will this change AI if they are? Who owns the final product generated by ChatGPT? Can anyone claim it if it consists only of information already in existence, presumably protected by copyright? Will ChatGPT’s outputs be considered derivative works? Do the benefits outweigh the risks when it comes to AI and ChatGPT? We have yet to see.
[1] 17 U.S.C. § 107.