The EU Digital Omnibus Agreement and AI Act Article 53: Reshaping Copyright Licensing for General-Purpose AI Training


Introduction

On 7 May 2026, negotiators from the European Parliament, the Council of the European Union, and the European Commission reached a provisional political agreement on the so-called Digital Omnibus package concerning the AI Act. Among the most consequential outcomes was the decision to preserve the original enforcement timeline for key obligations applicable to General-Purpose AI (GPAI) models. In particular, the transparency and copyright-related requirements under Article 53 of the AI Act remain firmly anchored to the 2 August 2026 deadline. This decision has immediate and far-reaching implications for how providers of foundation models source, document, and license the vast quantities of text and data used to train their systems.

Article 53 requires GPAI providers to implement policies ensuring compliance with Union copyright law and to publish a sufficiently detailed summary of the content used for training their models. By refusing to delay these obligations despite industry requests for more time, the co-legislators have sent a clear signal: the era of largely unregulated large-scale scraping of copyrighted material for AI training is coming to an end within the European market. AI developers have been forced into an urgent compliance scramble, auditing their training datasets, reassessing licensing arrangements, and preparing public disclosures that will inevitably reveal the scale and provenance of the data they have used.

Article 53 Obligations: Copyright Policies and Training Data Summaries

The core of Article 53, as clarified and maintained by the May 2026 Omnibus agreement, imposes two interrelated duties on providers placing GPAI models on the EU market or putting them into service. First, they must adopt and implement internal policies to ensure that the training, development, and operation of their models respect applicable copyright rules. Second, they must make publicly available a detailed summary of the content used for training, including identification of the main datasets and their sources. These obligations apply irrespective of whether the provider claims to rely on text-and-data-mining exceptions under the DSM Directive or on contractual licences.

The requirement to publish a “sufficiently detailed summary” is particularly significant from a licensing perspective. It effectively compels providers to disclose, at a meaningful level of granularity, the corpora, repositories, and individual sources from which training data was drawn. This transparency is intended to enable copyright holders to assess whether their works were used and, if so, to determine whether they were used lawfully, either under an exception or pursuant to a licence. In practice, it shifts the information asymmetry that has long characterised large-scale AI training and creates new leverage for rights holders in licensing negotiations.
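To make the idea of source-level disclosure more concrete, the following is a minimal sketch of how a provider might keep an internal inventory of training sources and roll it up into summary figures. It is not the official summary template and does not reflect any provider's actual practice; the record fields, the legal-basis labels, the dataset names, and the document counts are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRecord:
    """One hypothetical entry in an internal training-data inventory."""
    name: str         # label for the dataset or corpus
    origin: str       # where it came from (publisher feed, crawl, third party)
    legal_basis: str  # "licence", "tdm_exception", or "unknown" (assumed labels)
    documents: int    # number of documents drawn from this source


def summarise(records):
    """Total documents per legal basis and list sources with no documented basis."""
    totals = Counter()
    for record in records:
        totals[record.legal_basis] += record.documents
    unresolved = sorted(r.name for r in records if r.legal_basis == "unknown")
    return dict(totals), unresolved


# Illustrative data only; none of these sources or figures are real.
records = [
    SourceRecord("news-archive", "licensed publisher feed", "licence", 1_200_000),
    SourceRecord("open-web-crawl", "web crawl", "tdm_exception", 9_500_000),
    SourceRecord("legacy-forum-dump", "third-party dataset", "unknown", 300_000),
]

totals, unresolved = summarise(records)
print(totals)
print("Sources needing review:", unresolved)
```

In this sketch, the sources flagged as "unknown" are the ones a provider would need to resolve before publication, either by documenting an applicable exception or by obtaining a licence.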

Implications for Training Data Licensing Markets

The maintenance of the August 2026 deadline has already begun to reshape licensing practices around training data. Providers who previously relied predominantly on broad web scraping or on datasets whose provenance was opaque are now actively seeking to regularise their data supply chains. This has manifested in several concrete developments: accelerated negotiations with publishers, news organisations, stock image libraries, academic repositories, and specialised data providers for explicit training licences; the emergence of new contractual templates that address not only usage rights but also attribution, audit rights, and liability allocation; and increased scrutiny of existing data partnerships to determine whether they cover the scale and modalities of use required for foundation model training.

For copyright owners and data licensors, Article 53 creates both opportunity and complexity. On the opportunity side, the transparency obligation makes it materially easier to detect unauthorised use and to initiate licensing discussions from a position of greater information parity. Rights holders can now more credibly demand compensation or other terms for the use of their content in training data. On the complexity side, many existing licences were not drafted with AI training in mind. Questions of scope (for example, whether a licence for “text and data mining” or “machine learning” encompasses the creation of foundation models that may be used for a wide range of downstream generative tasks) are now being tested in real time. Licensors are revisiting reservation-of-rights language, while licensees are seeking broader grants and clearer safe harbours.

Compliance Risk, Enforcement Exposure, and Contractual Allocation

Non-compliance with Article 53 carries significant financial exposure. The AI Act's highest tier of administrative fines, up to €35 million or 7% of total worldwide annual turnover, whichever is higher, is reserved for prohibited AI practices. For providers of GPAI models, Article 101 empowers the Commission to impose fines of up to €15 million or 3% of total worldwide annual turnover, whichever is higher. While the precise enforcement posture of national competent authorities and the European AI Office remains to be seen, the combination of a hard deadline, public disclosure requirements, and substantial penalties has concentrated minds across the AI industry. Providers are therefore investing heavily in technical and legal due diligence on their training datasets, including provenance tracking, rights clearance where feasible, and the implementation of internal policies that can withstand regulatory scrutiny.
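The “whichever is higher” rule can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the turnover figures are hypothetical. It shows the €35 million / 7% top tier alongside the €15 million / 3% ceiling that Article 101 of the AI Act sets for providers of GPAI models. It computes the statutory maximum, not the fine an authority would actually impose.

```python
from decimal import Decimal


def fine_ceiling(turnover_eur: Decimal, fixed_cap_eur: Decimal, turnover_pct: Decimal) -> Decimal:
    """Return the upper limit of a fine under a 'whichever is higher' rule."""
    percentage_cap = turnover_eur * turnover_pct / Decimal(100)
    return max(fixed_cap_eur, percentage_cap)


# Hypothetical providers: one large, one small (total worldwide annual turnover in EUR).
large_turnover = Decimal("2000000000")
small_turnover = Decimal("100000000")

# Top tier of the AI Act: EUR 35 million or 7% of turnover.
print(fine_ceiling(large_turnover, Decimal("35000000"), Decimal("7")))  # 140000000
print(fine_ceiling(small_turnover, Decimal("35000000"), Decimal("7")))  # 35000000

# GPAI provider ceiling under Article 101: EUR 15 million or 3% of turnover.
print(fine_ceiling(large_turnover, Decimal("15000000"), Decimal("3")))  # 60000000
print(fine_ceiling(small_turnover, Decimal("15000000"), Decimal("3")))  # 15000000
```

For the larger provider the percentage-based limit governs; for the smaller one the fixed amount does, which is why the rule bites at both ends of the market.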

These compliance efforts directly influence licensing negotiations. AI companies are increasingly willing to accept more stringent contractual terms, including audit rights, indemnification provisions, and ongoing reporting obligations, in exchange for licences that demonstrably reduce regulatory risk. Conversely, sophisticated licensors are using the Article 53 disclosure requirement as leverage to obtain not only higher fees but also contractual commitments that the licensee will maintain accurate records and will not rely on the licence to justify scraping beyond its scope. The result is a rapid professionalisation of training-data licensing arrangements that were previously informal or non-existent.

Interaction with Text-and-Data-Mining Exceptions and Broader Copyright Framework

Article 53 does not eliminate the text-and-data-mining exceptions available under Articles 3 and 4 of the DSM Directive. However, it imposes an additional layer of transparency and policy obligations even where a provider purports to rely on those exceptions. A provider that claims to train exclusively on lawfully accessible works for which it has not obtained specific licences must still publish a detailed summary of the datasets used and must maintain copyright compliance policies. This dual structure creates a strong incentive for providers to move toward licensed data wherever feasible, both to strengthen their compliance narrative and to reduce the risk that a regulator or court will later conclude that reliance on an exception was not justified for the scale or commercial nature of the training activity.

The Omnibus agreement’s decision to keep the timeline intact also has implications for ongoing policy debates about whether existing exceptions are adequate for foundation model training. By forcing providers to confront the practical difficulties of large-scale rights clearance, the regime may accelerate the development of collective licensing mechanisms, extended collective licensing schemes, or sector-specific agreements between AI developers and rights-holder organisations. At the same time, it places pressure on Member States and the Commission to provide clearer guidance on the interplay between Article 53, the DSM exceptions, and the sui generis database right.

Global Ripple Effects and Strategic Responses

Because many leading GPAI models are developed by companies headquartered outside the EU but are placed on the EU market or made accessible to EU users, the Article 53 obligations have global reach. Providers based in the United States, China, and elsewhere are having to adapt their data governance and licensing strategies to satisfy European requirements. This extraterritorial effect is accelerating the emergence of a de facto global standard for training data transparency and copyright diligence in the foundation model sector.

In response, some providers are adopting a “license-first” approach for new model generations, while others are investing in technical solutions for dataset filtering, provenance tracking, and synthetic data generation as partial substitutes for scraped real-world data. Rights-holder organisations, for their part, are developing standardised licensing frameworks and collective management proposals tailored to AI training use cases. The coming months will reveal which models of rights clearance prove most workable at the massive scale required for frontier AI systems.

Conclusion

The May 2026 Digital Omnibus agreement’s decision to maintain the August 2026 enforcement deadline for Article 53 GPAI obligations has transformed what was previously a somewhat abstract regulatory horizon into an immediate compliance and licensing imperative. By requiring General-Purpose AI providers to implement copyright compliance policies and to publicly disclose detailed summaries of their training datasets, the EU has created powerful incentives for the formalisation and expansion of licensing markets for text and data used in AI training.

The reforms do not resolve all the difficult questions surrounding the relationship between copyright exceptions, contractual licensing, and large-scale AI training. They do, however, shift the default position from one of widespread informal scraping toward one in which providers must actively account for the rights they use and must make that accounting visible. For licensors, licensees, and policymakers alike, the coming period will be defined by rapid experimentation in licensing structures, due diligence practices, and disclosure methodologies. The EU’s insistence on the original timeline has ensured that these experiments will occur under real commercial and regulatory pressure rather than in a prolonged grace period.

Author: Amrita Pradhan. In case of any queries, please contact or write back to us at support@ipandlegalfilings.com or IP & Legal Filing.

References

  1. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act), O.J. L, 2024/1689, arts. 51–56, https://eur-lex.europa.eu/eli/reg/2024/1689/oj.
  2. Regulation (EU) 2024/1689, art. 53(1)(c)–(d), https://eur-lex.europa.eu/eli/reg/2024/1689/oj.
  3. European Commission, General-Purpose AI Obligations Under the AI Act, https://digital-strategy.ec.europa.eu/en/factpages/general-purpose-ai-obligations-under-ai-act.
  4. Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on Copyright and Related Rights in the Digital Single Market (DSM Directive), arts. 3–4, https://eur-lex.europa.eu/eli/dir/2019/790/oj.
  5. Regulation (EU) 2024/1689, art. 53(1)(c), https://eur-lex.europa.eu/eli/reg/2024/1689/oj.
  6. Department of Enterprise, Tourism and Employment (Ireland), Provisional Agreement Reached on the Digital Omnibus on AI (7 May 2026), https://www.gov.ie/en/department-of-enterprise-tourism-and-employment/press-releases/provisional-agreement-reached-on-the-digital-omnibus-on-ai/.
  7. European Commission, Proposal for a Regulation Amending Certain Requirements Under the Digital Omnibus Package Concerning Artificial Intelligence (COM/2025/836), https://eur-lex.europa.eu.
  8. World Intellectual Property Organization (WIPO), Generative Artificial Intelligence and Copyright: Legal and Policy Considerations (2024), https://www.wipo.int.
  9. Organisation for Economic Co-operation and Development (OECD), Copyright, Data Access and Artificial Intelligence (2024), https://www.oecd.org.
  10. European Commission, Shaping Europe’s Digital Future: Artificial Intelligence Act Implementation Materials, https://digital-strategy.ec.europa.eu.
  11. International Federation of Reproduction Rights Organisations (IFRRO), Licensing Solutions for Artificial Intelligence Training Uses (2025), https://www.ifrro.org.
  12. Le Monde, AI Act: 38 Global Creators’ Organizations Condemn “Betrayal” of Europe’s Stated Goals (Aug. 2, 2025), https://www.lemonde.fr/en/economy/article/2025/08/02/ai-act-38-global-creators-organizations-condemn-betrayal-of-europe-s-stated-goals_6744006_19.html.