AI and copyright

Post by Cristina Rusu, Copyright manager, Loughborough University

A humanoid robot pondering mathematical calculations on a blackboard
By Mike MacKenzie, CC-BY, via Flickr

As AI develops, a growing number of news stories report copyright infringement lawsuits against AI developers such as OpenAI (the maker of ChatGPT), Stability AI and many more. Why are there so many issues around AI and copyright?

Copyright is an Intellectual Property (IP) right which grants the creator of a work the right to allow or prevent copying of that work. Although works are protected regardless of their artistic merit, the work does need to be created by a “natural person”. Copyright also protects the expression of an idea rather than the idea itself: for example, software code is protected by copyright; what the code does is not.

Generative AI tools such as ChatGPT work because their underlying algorithms have been fed an extensive amount of data. The outputs these AIs create are a combination of that data and the user’s input.

AI is not limited to written outputs; some AIs have been trained to create the following media:

  • Image
  • Speech
  • Video
  • Music
  • Code.

So, what is the problem? In the beginning, companies were open about where their training data came from, but they have become more and more secretive about it. Scraping the internet and text and data mining (TDM) can be done for research purposes under copyright exceptions, but only if the access is lawful and the use is non-commercial. Most generative AI software now sits behind paywalls, which makes it hard to describe its use as non-commercial. The number of court cases under way in the UK, EU and US also suggests that all is not rosy. You can read more about some of the lawsuits below:

  • Artists Are Suing Artificial Intelligence Companies and the Lawsuit Could Upend Legal Precedents Around Art, by Shanti Escalante-De Mattei
  • 3 Lawsuits in 10 Days: Who Is Suing OpenAI, and Why?, by Allison Burt
  • AI learned from their work. Now they want compensation., by Gerrit De Vynck
  • Authors file a lawsuit against OpenAI for unlawfully ‘ingesting’ their books, by Ella Creamer

Suppose that the inputs (the data that the Generative AIs have been fed) are protected by copyright, and that their use has potentially been unlawful. It would follow that the outputs could amount to plagiarism at the least and copyright infringement at worst.

However, there are other issues to consider when using Generative AI. UNESCO’s quick start guide on ChatGPT highlights the following issues:

Issue                   Description
Academic integrity      Plagiarism and cheating
Lack of regulation      Security issues
Privacy concerns        No age-regulation
Cognitive bias          Biased ideas and perpetuates bias
Gender and diversity    Stereotyping and discrimination
Accessibility           Lack of access in certain countries
Commercialization       Extracting data for commercial purposes

Table 1: Issues with using ChatGPT (UNESCO, 2023, p. 11)

Alex Fenlon, Head of Copyright and Licensing in Library Services at the University of Birmingham, has also highlighted other risks associated with using Generative AIs.

Aleksandr Tiulkanov, AI and Data Policy Lawyer, created a useful flowchart to help assess when ChatGPT is safe to use. You could ask yourself: Does it matter if the outputs are true? If it does, do you have the knowledge to verify the accuracy of the output? Are you willing and able to take full responsibility (legal, moral, etc.) for missed inaccuracies? If the truth of the output does not matter, or if it does matter and you answered yes to the other two questions, then you can use ChatGPT, checking its outputs for accuracy; if you answered no to either of them, you may want to re-think your use of the GenAI.

UNESCO’s quick start guide on ChatGPT also suggests possible uses of ChatGPT at each stage of the research process:

  1. Research design: generate ideas for research questions or projects; suggest data sources.
  2. Data collection: search archives and datasets; translate sources into other languages.
  3. Data analysis: code data; suggest themes or topics for analysis.
  4. Writing up: improve writing quality; reformat citations and references; translate writing.

One point to make here is that anything you input into these AIs may be used as training data. Make sure you own the data you input and are happy for it to be re-used to train the AIs. Always read the terms and conditions. If in doubt, get in touch with your copyright officer or library.

Disclaimer: The information presented here does not reflect the views of Loughborough University.

Further reading

Guadamuz, Andres, A Scanner Darkly: Copyright Liability and Exceptions in Artificial Intelligence Inputs and Outputs (February 26, 2023). Available at SSRN: https://ssrn.com/abstract=4371204 or http://dx.doi.org/10.2139/ssrn.4371204

Lee, Jyh-An, Computer-generated Works under the CDPA 1988 (November 5, 2021). Artificial Intelligence and Intellectual Property (Jyh-An Lee, Reto Hilty & Kung-Chung Liu eds, Oxford University Press, 2021), The Chinese University of Hong Kong Faculty of Law Research Paper No. 2021-65. Available at SSRN: https://ssrn.com/abstract=3956911

Guadamuz, Andres, Do Androids Dream of Electric Copyright? Comparative Analysis of Originality in Artificial Intelligence Generated Works (June 5, 2020). Intellectual Property Quarterly, 2017 (2), Available at SSRN: https://ssrn.com/abstract=2981304