AI Content Discovery: What Companies Must Know in 2026

Litigation Minute: Is AI-Generated Content Discoverable? What Companies Need to Know in 2026

As artificial intelligence tools reshape how electronically stored information (ESI) is created and stored, companies face new questions about their discovery obligations. This is especially true of the prompts, outputs, and logs associated with large language models (GenAI Data). This article examines how generative AI (GenAI) tools affect those obligations and emphasizes two points:

Relevant GenAI Data is discoverable
Parties must treat it like any other potentially relevant ESI

Traditional Discovery Rules Still Govern Non-Traditional Data

Under Federal Rule of Civil Procedure (FRCP) 26(b)(1), parties may obtain discovery of any nonprivileged matter that is relevant to a party’s claim or defense and proportional to the needs of the case. Courts have made clear that new forms of ESI are not exempt from discovery merely because they are novel. Traditional discovery principles apply equally to emerging sources of ESI, including GenAI Data.

Key Early Decisions on GenAI Data Discoverability

One of the most significant rulings on GenAI Data came in In re OpenAI, Inc., Copyright Infringement Litigation. There, Magistrate Judge Ona Wang compelled the production of millions of GenAI logs, including user prompts and model responses, provided that user references were anonymized. The court concluded that these logs were relevant and proportional to the plaintiffs’ claims that the defendant’s AI systems reproduced copyrighted works in their outputs. Privacy concerns were addressed through anonymization and protective orders; they did not categorically bar the production of AI output.

Conversely, in a separate ruling within the same litigation, Magistrate Judge Wang denied a motion to compel the New York Times to produce content from its internal AI tools, finding that the request sought irrelevant material and was disproportionate. The New York Times argued that reviewing approximately 80,000 entries would require over 1,300 hours, a substantial burden given the limited connection to the issues at hand.
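The burden figures above lend themselves to a quick arithmetic check, which can be useful when preparing a proportionality argument. The sketch below is illustrative only: the function names are hypothetical, and the per-entry review rate is inferred from the reported totals rather than stated in the ruling.

```python
def review_hours(entries: int, minutes_per_entry: float) -> float:
    """Estimated reviewer hours for a given number of entries."""
    return entries * minutes_per_entry / 60


def implied_minutes_per_entry(entries: int, hours: float) -> float:
    """Per-entry review time implied by a total-hours estimate."""
    return hours * 60 / entries


# Figures as reported above: about 80,000 entries and over 1,300 hours.
# The implied rate is an inference, not a number from the ruling.
rate = implied_minutes_per_entry(80_000, 1_300)
print(f"Implied review time: {rate:.3f} minutes per entry")
```

On those figures the implied rate is a little under one minute per entry. A party documenting burden would substitute its own entry counts and a review rate measured on a sample.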

Relevance and Proportionality Still Reign

These rulings reinforce two key discovery concepts:

Relevance: GenAI Data is discoverable when it is tied to a claim or defense.
Proportionality: Even large volumes of GenAI Data can be discoverable when justified by the case’s needs, but a heavy review burden can defeat a request that has only a limited connection to the issues.

GenAI and E-Discovery in Practice

As GenAI becomes part of everyday work, parties should be prepared to address it in discovery. Preserving all GenAI Data is rarely reasonable or proportional, so it is crucial to develop a targeted, reasoned, and well-documented approach from the outset.

Identify Relevant GenAI Data

Determine whether any custodians of potentially relevant data are using GenAI tools, how they use those tools, and where prompts and outputs are stored. Relevant activity logs may exist separately, including on third-party platforms.

Preserve What’s Potentially Relevant

When litigation is anticipated, it is vital to preserve GenAI Data that relates to claims or defenses. This is particularly important if the data contains factual assertions or substantive content. Preservation steps can include:

Disabling auto-delete settings
Exporting chat histories
Saving key exchanges in document repositories
Coordinating with IT to understand retention of logs and metadata

Custodians should avoid editing or selectively copying GenAI Data in ways that alter context, and they should disclose any use of personal or browser-based tools so that those sources can be evaluated. Specific preservation measures will vary based on the case and the systems in use.
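One way to document that exported GenAI Data has not been edited after collection is to record a cryptographic hash of each exported file at the time of preservation. The following is a minimal sketch, assuming chat histories have already been exported as files into a single directory. The function names and manifest layout are hypothetical and are not drawn from any particular e-discovery platform.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(export_dir: Path) -> list[dict]:
    """List every file under export_dir with its size and hash."""
    entries = []
    for path in sorted(export_dir.rglob("*")):
        if path.is_file():
            entries.append({
                "file": path.relative_to(export_dir).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return entries


def changed_files(export_dir: Path, manifest: list[dict]) -> list[str]:
    """Return manifest files that are now missing or have a different hash."""
    changed = []
    for entry in manifest:
        path = export_dir / entry["file"]
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            changed.append(entry["file"])
    return changed
```

Running build_manifest when the hold is implemented and changed_files at collection would flag any export that was altered or removed in between. It does not show what changed, and it covers only files that were exported in the first place.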

Negotiate Scope Early

If GenAI Data is implicated, address relevance and proportionality in ESI protocols and early meet-and-confer discussions. Clear definitions and targeted limits can prevent fishing expeditions and reduce costs and burdens.

Address Confidentiality

Take privacy concerns seriously. Where feasible, use protective orders and anonymization protocols to manage sensitive information while fulfilling discovery obligations.
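As an illustration of what an anonymization protocol might involve at the data level, the sketch below replaces user identifiers in log records with keyed pseudonyms. It is an assumption-laden example and not a description of the method used in any of the cases discussed: the record layout, the field name "user", and both function names are hypothetical. Strictly speaking, this is pseudonymization, which is weaker than full anonymization.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Map a user identifier to a stable token using HMAC-SHA256."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return "user-" + digest.hexdigest()[:16]


def anonymize_log(records: list[dict], secret_key: bytes) -> list[dict]:
    """Return copies of the records with the 'user' field replaced.

    Assumes each record is a dict with a 'user' key (hypothetical layout).
    The input records are left unmodified.
    """
    return [
        {**record, "user": pseudonymize(record["user"], secret_key)}
        for record in records
    ]
```

The same user always maps to the same token, so conversations can still be grouped by user without exposing the underlying identity, provided the key is kept confidential. Prompts and responses may themselves contain identifying details, which this approach does not address. Any such protocol would need to be negotiated with opposing counsel and reflected in the protective order.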

Update Information Governance

Incorporate GenAI Data into ESI inventories, legal hold procedures, and retention policies to enhance discovery readiness. Consider implementing AI-specific policies regarding acceptable use and data confidentiality.

Conclusion

The discoverability of GenAI Data is rapidly becoming a pivotal issue in e-discovery. Courts are not providing exemptions for GenAI Data; traditional discovery principles remain applicable. When GenAI Data is central to a dispute, it is likely to be discoverable, but proportionality will continue to serve as a meaningful limit. Companies and their litigation teams should proactively address GenAI Data in discovery planning, collaborate closely with e-discovery specialists to minimize burdens, and manage privacy concerns effectively.

Stay tuned for upcoming discussions on the intersection of GenAI Data with attorney-client privilege and work product doctrine.