How to use generative AI legally and safely

AI is starting to be used in more and more fields. But what should companies do to utilize ChatGPT without the risk of copyright infringement?

Generative AI, which creates content on its own, such as writing text, creating images, and writing program code, is being used in more and more fields. However, many companies are wondering how to use generative AI like ChatGPT without risking copyright infringement and lawsuits.

Companies looking to leverage generative AI must answer four questions about copyright:

  • Who ‘owns’ the output created by artificial intelligence such as ChatGPT?
  • How do AI operators protect themselves when training AI?
  • Does using AI output risk infringing the copyright in the works used as training material?
  • Can AI tools be used at all in compliance with copyright law?

Ownership of output created by generative AI

When a company uses generative AI to create content, such as text, images, or program code, neither the user nor the AI operator, such as OpenAI, generally holds a copyright in the result. This is because copyright law presupposes a human being as the author of a creative work. In other words, the output of artificial intelligence is not a copyrighted work under copyright law and, legally, does not belong to anyone.

However, the assessment can differ if the prompt, i.e. the user’s command to the AI, can itself be considered an individual’s intellectual creation. In that case, the AI plays only a secondary role in producing the outcome: it is more of a tool that carries out instructions than an independent creator. Of course, the user’s prompts must be very detailed for this exception to apply.

A typical example is when a user gives ChatGPT a text they wrote themselves and asks it to proofread the language. By contrast, the prompt “Please summarize the progress of quantum computing over the past 12 months” does not give rise to copyright, since the output is a lengthy text written by ChatGPT itself.

The role of contracts between users and AI operators

The AI operator’s terms and conditions do not change this legal situation. A contract can only regulate the relationship between the contracting parties; it cannot create rights that bind third parties unless the subject matter of the contract is itself protected.

Therefore, a grant of the right to use the output is effective only between the parties. Its only effect is that the party granting the right promises the other contracting party to refrain from using the output: the granting party may then neither use the result itself nor allow a third party to use it.

Neither the party granting the “license” nor the contracting party can prevent third parties from using the results. The contracting party can only demand an injunction and, if necessary, compensation for damages from the party who granted the ‘right to use’.

Model training by AI operators can be legally covered. Under German law, materials freely available on the internet may be used for AI training, which counts as data mining. However, this applies only if the author has not objected in machine-readable form.

Role of Legal Principles

To avoid copyright infringement, AI operators must design the learning process so that the AI respects the relevant notices of rights holders on the internet. In any case, the operator must ensure that material reproduced for learning purposes is deleted once the learning phase is over.

Of particular importance in this context is that the AI must be programmed not to incorporate the original data into its results, or at least to alter it, since only analysis is permitted. Incorporating the original data unaltered is possible only with the express consent of the relevant rights holders.

A user who reproduces or reuses output that contains infringing content may also be liable for copyright infringement. This is the case, for example, when the output is saved to storage controlled by the user.

In the case of purely private use of the material, the user may be able to rely on the right of private reproduction. However, this is only possible if the copied work was not obviously illegally reproduced or made publicly accessible.

The crucial factor in obviousness is whether an objective observer could recognize the illegality. In addition, the right of private reproduction does not cover all copyrighted material: it does not apply to the code of computer programs or to databases.

Crucially, AI that draws on copyrighted creations does not necessarily infringe copyright if the original work is modified to the point that the original material is no longer recognizable.

How AI Infringes Copyright

Put another way, the more similar the work is to the copyrighted material, the higher the likelihood of copyright infringement. Let’s look at two examples.

When the user’s prompt asks ChatGPT for a simple reproduction (e.g. the output of song lyrics) or a direct translation of a text, the copyright protection of the original material applies equally to the output. Copyright infringement occurs when the resulting work is used in a manner that is neither authorized by the author of the original material nor covered by a copyright exception.

However, as mentioned earlier, the situation is different when it comes to generating non-fiction texts, as with the quantum computing prompt. Here, the output of ChatGPT is generally markedly different from the texts the AI was trained on. Through its own formulations, new semantic contexts, blending, or different emphasis, ChatGPT produces new text that resembles human-written technical documentation, so it rarely causes copyright infringement.

Dangers of the gray area

While there are clear-cut cases like these, there is also a broad gray area. The extent to which the source material is recognizable in the output often only becomes evident when individual cases are examined.

Many companies wonder whether generative AI tools like ChatGPT can be used in compliance with copyright law. In particular, when the AI has been trained on data from unknown sources, user companies face a dilemma.

Generative AI tools can be used without infringing copyright, but users should obtain confirmation from AI operators that they complied with copyright law, in particular with block notices, when training the AI. Even so, this does not prevent rights holders from taking legal action against users if AI operators do not keep their promises.

However, depending on the type of prompt, users can at least assess the risk. Another conceivable safeguard is an AI solution that checks, before commercial use, whether AI-generated artifacts are identical or very similar to copyrighted works.
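
A check of this kind can be sketched in a few lines of Python. The following is only an illustration under stated assumptions: the function names, the reference collection of protected texts, and the 0.8 threshold are all hypothetical, and a character-level similarity ratio is no substitute for a legal assessment of whether the source material is still recognizable.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio between 0 and 1, ignoring case and extra whitespace."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())

    return SequenceMatcher(None, normalize(a), normalize(b), autojunk=False).ratio()


def flag_for_review(output: str, protected_works: dict[str, str],
                    threshold: float = 0.8) -> list[tuple[str, float]]:
    """List the (title, ratio) pairs whose similarity reaches the threshold.

    The threshold is an arbitrary illustrative value, not a legal standard.
    """
    scored = [(title, round(similarity(output, text), 2))
              for title, text in protected_works.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Anything the function returns would go to a human reviewer before commercial use. An empty result only means that no close textual match was found in the reference collection that was supplied.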

When companies train AI solutions on their own data, there are points to consider to avoid problems later. In particular, they need to check that the training data does not contain copyrighted works whose use the author has not permitted.

When materials are available online, companies can assume that any accessible material not accompanied by a block notice may be used for analysis. Material with block notices, however, should be checked carefully.
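
How such a check might look in practice depends on the form of the block notice. As a minimal sketch, assume the notice is expressed in a site's robots.txt file, which is one common machine-readable mechanism but not necessarily the only one that matters legally. The crawler name `ExampleTrainingBot` and the function name are hypothetical; the parser itself comes from Python's standard library.

```python
from urllib.robotparser import RobotFileParser


def may_collect(robots_txt: str, url: str,
                crawler_name: str = "ExampleTrainingBot") -> bool:
    """Return True if the given robots.txt text does not block the crawler for this URL."""
    parser = RobotFileParser()
    # Parse robots.txt content that has already been downloaded.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(crawler_name, url)


# A site that blocks the hypothetical training crawler but allows everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""
```

With `EXAMPLE_ROBOTS`, a call for `ExampleTrainingBot` returns `False`, while a crawler with a different name is allowed. A `True` result only shows the absence of a robots.txt rule; notices in terms of use or page metadata would need separate checks.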

Also, don’t forget to delete the material used once the learning phase is complete. This applies not only to text generators, but also to image-generating AIs such as Midjourney, Stable Diffusion or DALL-E.
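
One way to make deletion hard to forget is to keep the working copies in a temporary directory that is removed automatically when training ends. The sketch below assumes the copies are plain text files and that training is passed in as a callback; `train_with_temporary_copies` and the `train` parameter are hypothetical names, not part of any particular AI framework.

```python
import tempfile
from pathlib import Path
from typing import Callable


def train_with_temporary_copies(documents: dict[str, str],
                                train: Callable[[list[Path]], None]) -> Path:
    """Keep working copies only for the duration of training, then delete them."""
    with tempfile.TemporaryDirectory(prefix="training_copies_") as workdir:
        root = Path(workdir)
        for name, text in documents.items():
            (root / name).write_text(text, encoding="utf-8")
        # Hypothetical training step that reads the copied files.
        train(sorted(root.iterdir()))
    # Leaving the with-block removes the directory and every copy in it,
    # even if the training step raised an exception.
    return root
```

The returned path no longer exists after the call, which makes it easy to verify in a test or audit log that the copies are gone. Copies held elsewhere, such as caches or backups, are outside the scope of this sketch.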

Compliance with copyright laws is not the only determining factor for legally safe use of AI. Possible violations of individual rights or data protection laws are also important issues for companies to consider.
