Copyright Infringement And AI
Learn how the integration of AI would affect liability allocations
Posted on 04-30-2024, Read Time: 5 Min
Highlights:
- As AI becomes more prevalent, legal questions arise regarding liability and copyright issues, particularly concerning generative AI solutions.
- Generative AI solutions use training data, potentially infringing on copyrights. However, proving infringement can be complex due to the nature of AI-generated output.
- The intersection of generative AI and copyrights raises unresolved legal issues, leading to ongoing litigation.

Most of the legal questions raised by AI concern how its integration will affect the allocation of liability in situations where the AI’s “judgment” was relied upon to perform a task that traditionally involved only human judgment. These liability implications are crucial to define and understand, especially in high-risk implementations such as healthcare applications and the operation of large equipment (e.g., autonomous vehicles).
While the effects of replacing human judgment with AI judgment are important issues to consider, in at least some AI implementations, there are other, more fundamental, legal issues that must be resolved.
Many AI solutions are built through a training or learning process that produces a dynamic algorithm, or model, that forms the core of the solution. When that model is used to produce new content, the solution is referred to as a generative AI solution.
Generative AI solutions often leverage information from the internet and other available data sources to “train” the model using this training data. In some instances, generative AI training processes may be implemented continuously in an otherwise unsupervised manner to develop the model. Ultimately, the model may be used by the generative AI solution to, for example, generate new content, such as images, video content, text content, articles, poems, stories, compositions, sound recordings, and computer code.
Because generative AI can create new content quickly, with minimal human contribution and talent, it has been adopted rapidly and widely. From nothing more than a user’s natural-language request, a generative AI solution can produce content that traditionally required substantial human talent and time to create. That efficiency and ease of use continue to drive adoption.
An aspect of some generative AI solutions that many casual users may not appreciate is that the AI output is still based on the data sources used for training, which are unlikely to be owned by the AI solution provider or the end user. In other words, the AI solution generates the new content from a model built on training data that may be unlicensed and owned by third parties.
Since copyrights often protect at least some of this data, a question arises as to whether such use of the data by the AI solution constitutes a copyright infringement. With this in mind, it is important to appreciate that the AI output is not developed in a vacuum, but it is an algorithmic output that is likely based on copyrighted data that is owned by, for example, third-party artists and authors.
The legal framework for copyrights protects the original works of authors and artists by offering a cause of action for copyright infringement. An infringement involves any copying of a substantial portion of a copyrighted work. It is noteworthy that the copying need not result in an identical copy, but in most instances, substantial portions of the copyrighted work must be included in the infringing work.
With respect to generative AI, the output may not include a substantial portion of any one copyrighted work, and for that reason, claims of copyright infringement due to the use of copyrighted works as training data may be difficult to establish. In other words, the AI output may not include enough material from the copyrighted work to support an infringement claim based solely on the AI output.
However, some have argued that training an AI model necessarily involves some copying of the copyrighted training data and that those instances of copying could themselves constitute copyright infringement. Unfortunately for copyright owners, without access to, for example, the data sets and the computer code that were used, it is often difficult to determine whether the training data includes a particular individual’s copyrighted work and how the training process incorporates such works.
Adding to the complexities, a common defense to copyright infringement is fair use. The legal doctrine of fair use permits the use of unlicensed, copyrighted works in some instances if the alleged copyright infringer can establish that their use of the copyrighted work constitutes fair use. Factors considered in the fair use analysis include the purpose and character of the use, the nature or type of work being copied, the amount of the copyrighted work that has been copied, and the effect on the market for the copyrighted work.
For both infringement and fair use, the analysis of the generative AI output is generally different from the analysis of the model training process. As to the AI output, if the output is quite similar to the copyrighted work, then infringement is more readily established. However, the AI output often lacks a substantial portion of any one copyrighted work, making the case for infringement based on the output more difficult.
As to the training process, copying of an entire copyrighted work may be more readily established if the details of the training process are known. Even if such copying during training can be established, arguments can still be made that copying associated with generative AI is protected under fair use. For example, generative AI models themselves have a very different purpose and character from the underlying copyrighted works, which could be sufficient to establish a fair use defense for generative AI.
The uncertainty at the intersection of generative AI and copyrights remains unresolved. Parties are currently in the midst of litigation that may produce a framework for addressing it. Original content creators, including artists and authors, believe they are receiving no compensation for their works, while those same works are being leveraged and monetized for commercial gain by the AI solution providers.
While there are active efforts to develop law that addresses many of the legal issues raised by AI, it remains to be seen whether courts and lawmakers will find ways to address these issues under new and existing legal frameworks. It is, however, clear that AI solutions are beneficial for handling a vast number of tasks and problems.
As such, finding ways to embrace and promote this technology, while also benefiting contributors who intentionally or unintentionally provide content for model training, is an issue that will need to be resolved in the very near future.
This article first appeared here.
Author Bio
Nathaniel Quirk is a Partner at Burr & Forman LLP. Nathaniel focuses his practice on securing and protecting intellectual property rights in the areas of patents, trademarks, copyrights, and trade secrets.