Generative Artificial Intelligence (GenAI)

Introduction

The use of Generative Artificial Intelligence (GenAI) tools raises a variety of copyright considerations. While these tools are rapidly evolving and there are many questions around how they work and how they can be used, it is important to remember GenAI and its related technologies and processes are subject to Canadian laws and regulations.

To better understand the issues discussed in this section, you are encouraged to review the library's Copyright Overview page and Humber Polytechnic’s Copyright & Fair Dealing policies.

Over the past 50 years, copyright legislation and case law have struggled to keep up with the pace of technological changes and the explosive growth in the methods of creating, distributing and consuming content. The relatively recent emergence of GenAI tools means the legal status of GenAI services and the outputs they generate is currently unsettled in Canada. This situation is not unique to Canada; the rest of the world is scrambling to define the legal boundaries of GenAI as well.

If the copyright issues with GenAI are so unclear, how can you successfully utilize these powerful tools, while at the same time avoiding activities that are potentially in violation of the law?

To try to answer this question, let's review the key aspects of how copyright and GenAI intersect.

Content Development for Large Language Models

Large language models (LLMs) are the foundation upon which GenAI tools like ChatGPT are built and made functional. On an ongoing basis, companies like OpenAI (which created and operates ChatGPT) feed massive amounts of text and data into their LLMs to make them more accurate and responsive to the needs of the user.

It has become clear that these companies have used content available on the open web and many other undeclared sources (scanned books, audio, video, etc.) to build their LLMs. Through these activities, OpenAI is now worth billions of dollars. If copyright is “the exclusive legal right given to a creator to reproduce, publish, sell or distribute their work”, then it would be natural to assume that OpenAI received some form of permission to use copyrighted works (books, websites, etc.) to build such a valuable company.

It turns out OpenAI and similar companies did not get these kinds of permissions when they started building their LLMs. Instead, it appears they adhered to one of the underlying mottos of Internet-era entrepreneurship, “move fast and break things”, and raced to put their products into consumers’ hands as fast as possible. This motto also implies entrepreneurs should have limited concern for the potential merits and consequences of their product or the legal issues that may arise from how they have chosen to operate their companies. Through rapid growth, they hope laws will be modified to accommodate their practices.

A variety of GenAI companies are now facing multiple lawsuits from authors, publishers, and artists. These legal cases boil down to the following legal arguments:

The View of Authors/Publishers/Artists

  • GenAI companies have violated copyright law by copying and exploiting their works without permission.
  • They should receive compensation for the practices GenAI companies have undertaken in the past and will in the future.
  • Governments should create laws and regulations which restrain the methods by which GenAI companies utilize copyrighted works.

The View of GenAI Companies

  • AI (and its potential benefits to society) cannot be developed without the ability to ingest copyrighted material.
  • What they are doing falls under specific exceptions (fair dealing, fair use, text and data mining, etc.) found in most nations’ copyright law.

These legal cases will likely take many years to settle. However, now that GenAI companies have reached such large valuations, they are starting to enter into partnerships with groups like publishers to avoid legal complications in the future.

Given the disputes around the legal practices of GenAI companies, how comfortable should you feel using their tools? From a strictly legal point of view, there is no plausible scenario in which the millions of users of a tool like ChatGPT would be made part of the lawsuits GenAI companies are now facing.

GenAI companies understand that, given these lawsuits, some users are wary of using their tools, so they are taking steps to alleviate potential concerns. Microsoft offers a primarily text- and image-based GenAI platform called Copilot Protected, which Humber currently provides access to. This version of the tool:

  • does not save prompts or answers
  • prevents Microsoft from having access to the data you provide as inputs
  • cites the data sources for the answers it provides
  • comes with a claimed commitment to assume legal responsibility if copyright claims are made against you related to outputs you generate

OpenAI offers similar protections, called Copyright Shield, for users of the paid enterprise version of ChatGPT.

Inputting Copyrighted Works Into GenAI

GenAI tools vary in the kinds of inputs they accept to generate outputs. Some allow for text-based prompts. Others allow you to upload images, music and videos that can serve as the inspiration for various kinds of outputs.

If what you input into the tool is your own work, you should be aware this work may end up being used by the tool to improve its LLM. The terms of use for OpenAI say you retain your ownership rights in what you input, and you own the output; however, OpenAI has the right to use the content to develop and improve its services.

But what if you want to input a poem, a photograph, etc., that was created by someone else into a GenAI tool? Services like ChatGPT take the approach of including language in their terms of use that puts the onus on the user. For example, the terms of use for OpenAI say that you cannot use their services in a way that “infringes, misappropriates or violates anyone’s rights”. In other words, you need to determine if the amount of a copyrighted work you choose to input into the tool falls within the limits of the copyright exceptions in your country (e.g. fair dealing in Canada).

Say for example you want to create an image of a large, walking, talking, yellow bird.

When the above prompt was entered into a GenAI tool, the platform outputted the image shown in Figure A:

Figure A

You may think this image is suitable for your purposes, but what you really want is an image that looks more like the Sesame Street character, Big Bird. The GenAI tool allows you to upload a picture of Big Bird as inspiration, and you get an image like Figure B.

Figure B

According to OpenAI's terms of use, you, the uploader, are responsible for determining whether inputting the Big Bird image violates the rights of the copyright holder of the original image.

It is challenging for non-lawyers to make these sorts of legal determinations, so a general guideline is to not use someone else’s copyrighted material in a GenAI tool unless you have permission, you have a license to do so, or you are comfortable that the use falls within the exceptions to copyright in Canadian law.

Other options for content to use in GenAI tools include openly licensed material and material in the public domain (material whose copyright has expired).

Copyright Status of Works Outputted by GenAI

The copyright status of works created by GenAI is unclear. Traditionally, Canadian law has indicated that an author must be a human who exercises skill and judgement in the creation of a work. Some argue that varying degrees of human skill and judgement go into the final product outputted by a GenAI tool, and that such products are therefore eligible for some form of copyright protection. But currently in Canada, there is no legal framework to determine who the appropriate author/owner of an AI-generated work is.

GenAI tools vary in their terms of use, so some may make broader claims of ownership over what is generated by the tool. While the copyright status of what is created may be unclear, terms of use fall under contract law, which usually favours the provider of the service and not the user when there is a legal dispute.

It is likely that most users of GenAI tools will try to claim that the works they generate are their copyrighted works and should be granted the same protection as if they had created the works in traditional digital tools like Photoshop. It will take cases going through the legal system, along with updated laws and regulations, to better define which AI-associated content can be granted copyright protection.