26/05/2026

The Single Biggest Risk of Generative AI: Why Hallucinations Outweigh Every Other Failure Mode

Key takeaways

Most rankings list eight to twelve risks of generative AI. The search query asks for one, and one risk does structurally dominate: hallucination, which US regulatory vocabulary calls confabulation.
NIST AI 600-1 names confabulation as one of twelve generative-AI-specific risk categories and treats it as a structural property of the technology, not a temporary defect.
Recent research shows that scaling training data does not eliminate hallucinations: GPT-4o and Claude 3.7 still produce 15 to 20 percent invented citations on factual tasks, rising to 35 to 55 percent on niche or recent topics.
Under the EU AI Act, Article 50 and Articles 51 to 55 set provider and deployer obligations that map onto the hallucination problem (transparency, synthetic-content marking, model evaluation, training-data summary).
Every other commonly cited risk (bias, IP infringement, data leakage, deepfakes) compounds the same root cause: outputs whose epistemic status the user cannot verify.

Why “one major risk” is the wrong question and the right one at the same time

Search “what is one major risk associated with generative AI models” and Google answers with lists of eight, ten, twelve risks. SentinelOne enumerates ten security risks. AIMultiple lists ten. IBM lists ten. ICAEW lists four. The user asked for one. The rankings answer with many.
The listicle reflex is not random. It is a defensive posture from vendors who want to look comprehensive and audit firms whose business depends on selling broad coverage. It is also an analytical category error. A list with ten items implies that the items are commensurable, that each could be mitigated independently, and that ranking them is a matter of taste. None of that is true for generative AI.
MIT Sloan offers a more useful frame. It distinguishes embedded risks, which are inherent to the technology, from enacted risks, which depend on how a deploying organisation uses the system (MIT Sloan). Embedded risks survive every deployment choice. Enacted risks can be designed away with controls and process. The dominance test for “one major risk” should therefore ask: which risk is fully embedded, which cannot be designed away, and which any user-facing deployment must inherit?
Only one candidate passes that test. Bias can be mitigated through data curation. IP exposure can be reduced through training-data provenance. Prompt injection can be defended at the application layer. None of those failure modes is structural to the model itself. Hallucination is.

The answer: hallucination (confabulation) is the dominant risk

In the US regulatory vocabulary, the failure mode is called confabulation. NIST AI 600-1, the Generative AI Profile published in July 2024 as a cross-sectoral supplement to the AI Risk Management Framework 1.0, lists twelve risks unique to or exacerbated by generative AI: CBRN information or capabilities, confabulation, dangerous or violent or hateful content, data privacy, environmental impacts, harmful bias and homogenisation, human-AI configuration, information integrity, information security, intellectual property, obscene or degrading or abusive content, and value chain and component integration (NIST AI 600-1). The list is ordered alphabetically rather than by severity, but confabulation is the category that every other entry in the profile interacts with.
A 2025 arXiv paper by Charles Rathkopf makes the structural argument explicit: “any generative model that aims to produce complex, structured data will sometimes produce hallucinations, and even massive increases in the size of training data will not make hallucinations go away” (Rathkopf, 2025). The errors are not inherited from the training set. They are produced by the model itself.
The same paper documents the operational scale of the problem. GPT-4o and Claude 3.7 exhibit 15 to 20 percent hallucination rates on factual citation tasks, climbing to 35 to 55 percent on niche or recent topics. A separate analysis of 4841 papers accepted to NeurIPS 2025 identified at least 100 confirmed hallucinated citations spanning 53 papers, roughly one percent of accepted work in a venue with extensive peer review.
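The “roughly one percent” figure can be checked with a few lines of arithmetic. The sketch below uses only the three numbers quoted in this section and does not verify them independently. The variable names are illustrative.

```python
# Figures as quoted in the text; "at least 100" makes the citation count a lower bound.
accepted_papers = 4841
papers_with_hallucinated_citations = 53
confirmed_hallucinated_citations = 100

# Share of accepted papers that contain at least one confirmed hallucinated citation.
share_of_papers = papers_with_hallucinated_citations / accepted_papers

# Average number of confirmed hallucinated citations per affected paper.
citations_per_affected_paper = (
    confirmed_hallucinated_citations / papers_with_hallucinated_citations
)

print(f"Share of accepted papers affected: {share_of_papers:.2%}")  # 1.09%
print(f"Confirmed hallucinated citations per affected paper: "
      f"{citations_per_affected_paper:.2f}")  # 1.89
```

The share comes out at about 1.09 percent, which matches the rounded figure in the text. The affected papers average just under two confirmed fabricated citations each.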
A companion 2025 paper, Beyond Accuracy: Rethinking Hallucination and Regulatory Response in Generative AI, argues that current frameworks have inherited too narrow a view of hallucination and proposes a layered framing: epistemic instability inside the model, user misdirection at the interface, and social-scale effects when hallucinated outputs propagate through search, journalism, scientific literature, and decision support (arXiv 2509.13345). Each layer multiplies the impact of the layer below.
Why this risk dominates: every downstream harm associated with generative AI presupposes that outputs can be trusted at face value. A defamation case starts when an LLM invents a criminal record for a real person. A regulatory misstatement begins when an analyst pastes hallucinated figures into a 10-K filing. A clinical misdirection occurs when a triage assistant fabricates a drug interaction. None of these failures happens if the user can verify the output, and the verification gap is exactly what confabulation creates.

Confabulation vs lying vs bias

Confabulation is structural, not adversarial. The model is not deceiving anyone. It is generating tokens with high local plausibility because that is what the next-token objective rewards. That is different from prompt injection, where an external party deliberately steers the model away from its intended behaviour, and different from bias, which is a statistical property of training data and decision boundaries. Confabulation can coexist with both. A confabulating model can be biased and can be exploited at the same time, and the bias and the exploit will both be easier to land because users cannot reliably distinguish authoritative output from invented output.

How regulators treat this risk

The convergence is striking. Three of the most consequential governance regimes (the EU AI Act, NIST AI 600-1, and ISO/IEC 42001) all treat hallucination, transparency, and verifiability of generative outputs as central design constraints, even when they use different vocabulary.

EU AI Act: Article 50 transparency obligations

Article 50 of the EU AI Act sets transparency duties for AI systems intended to interact with natural persons and for systems that generate synthetic content. Providers must mark output as artificially generated in a machine-readable format. Deployers of AI systems generating or manipulating image, audio, or video content that constitutes deepfakes must disclose that the content has been artificially generated or manipulated. These obligations exist precisely because regulators concluded that generative outputs cannot be trusted to identify themselves, which is the operational definition of the confabulation risk transposed into law (European Parliament, EU AI Act overview). The Commission has been explicit. Under the Digital Services Act, it issued formal requests for information to six Very Large Online Platforms and two Very Large Online Search Engines about mitigation measures for generative AI risks. The list of risks the Commission named begins with “hallucinations where AI provides false information” and continues with viral deepfakes, automated manipulation of services and electoral processes, illegal-content dissemination, fundamental-rights infringements, and IP exposure (Commission RFI on GenAI risks, 2024).

EU AI Act: GPAI obligations (Articles 51 to 55)

General-purpose AI models receive their own chapter in the Act. Article 51 distinguishes ordinary GPAI from GPAI with systemic risk. Article 53 obliges providers of GPAI models to draw up and maintain technical documentation, supply information to downstream providers, put in place a policy to comply with Union copyright law, and publish a sufficiently detailed summary of the content used for training (Article 53(1)(d)). Article 55 adds model evaluation, adversarial testing, serious-incident reporting, and cybersecurity requirements for GPAI models classified as systemic. The training-data summary and adversarial-testing duties recognise that opaque generation, where outputs cannot be traced to specific provenance, is a regulatory hazard. Most provisions of the Act apply on 2 August 2026. Prohibitions and AI literacy provisions have applied since 2 February 2025, and the GPAI obligations since 2 August 2025.

NIST AI 600-1: confabulation as one of twelve categories

As noted above, NIST AI 600-1 places confabulation alongside CBRN, data privacy, harmful bias, information integrity, information security, intellectual property, and value chain integration. The profile provides over two hundred recommended actions mapped to the four functions of the base AI RMF (Govern, Map, Measure, Manage). For confabulation specifically, the recommended actions cluster around grounding outputs in verifiable sources, communicating uncertainty to users, and structuring human oversight at the interface layer.
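One way to picture “grounding outputs in verifiable sources” is a check that every citation in a generated answer points to a document the system actually retrieved. The sketch below is a minimal illustration of that idea and is not taken from NIST AI 600-1. The `[S1]` citation convention and the names `check_grounding` and `GroundingReport` are assumptions made for the example. The check confirms only that a cited source exists in the retrieved set, not that the source supports the claim.

```python
import re
from dataclasses import dataclass

# Assumed convention: the model is prompted to cite retrieved sources as [S1], [S2], ...
CITATION_PATTERN = re.compile(r"\[(S\d+)\]")


@dataclass
class GroundingReport:
    cited: set[str]                # source ids that appear in the answer
    unsupported: set[str]          # cited ids that were never retrieved
    uncited_sentences: list[str]   # sentences that carry no citation at all

    @property
    def needs_review(self) -> bool:
        # Anything unsupported or uncited is surfaced to a human reviewer.
        return bool(self.unsupported or self.uncited_sentences)


def check_grounding(answer: str, retrieved_ids: set[str]) -> GroundingReport:
    """Compare citations in the answer against the ids of retrieved sources."""
    cited = set(CITATION_PATTERN.findall(answer))
    unsupported = cited - retrieved_ids
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited = [s for s in sentences if not CITATION_PATTERN.search(s)]
    return GroundingReport(cited=cited, unsupported=unsupported, uncited_sentences=uncited)


example = check_grounding(
    "Drug A interacts with drug B [S1]. The effect is severe [S7]. "
    "Always consult a pharmacist.",
    retrieved_ids={"S1", "S2"},
)
print(example.unsupported)        # {'S7'}
print(example.uncited_sentences)  # ['Always consult a pharmacist.']
print(example.needs_review)       # True
```

A report like this gives the interface something concrete to show a reviewer: which claims cite a source that was never retrieved and which carry no citation at all. That is one possible form of the uncertainty communication the profile recommends.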

ISO/IEC 42001: operational response

ISO/IEC 42001:2023, the first international standard for AI Management Systems, gives organisations a certifiable management-system frame. Clause 6.1.3 (risk treatment) and Annex A controls on AI system impact assessment, data quality, and human oversight operationalise the response to confabulation as a documented control. The standard mandates AI Impact Assessments for high-impact systems and requires documented identification, assessment, and mitigation of bias, accountability, data protection, human oversight, explainability, and fairness (ISO/IEC 42001:2023). A certified AIMS gives an organisation a defensible answer to the question “how do you manage confabulation in production?” because the standard turns the question into auditable evidence rather than ad-hoc assurances.

Why every other commonly cited risk depends on this one

The listicles are not wrong about the items they list. They are wrong about the structure. Bias, IP, data leakage, and deepfakes are real and consequential. They are also, in different ways, downstream of the verifiability gap that confabulation creates.

Bias

NIST SP 1270 categorises AI bias into three layers: computational bias from data and modelling choices, systemic bias from broader institutional patterns, and human bias in interpretation and use (NIST SP 1270). Bias becomes a deployment problem when users cannot verify whether a given output reflects the underlying evidence or a statistical artefact. If you cannot tell whether the answer is invented, you cannot tell whether it is biased either. The two failure modes share the same diagnostic gap.

IP and copyright infringement

Generative models trained on copyrighted material can produce outputs that reproduce or closely paraphrase the source. Article 53(1)(d) of the AI Act requires providers to publish a sufficiently detailed summary of training data, which is a transparency control aimed at the same root problem: outputs whose lineage to source material cannot be traced. The Code of Practice on marking and labelling of AI-generated content sits in the same lineage of provenance-restoration controls.

Data leakage and prompt injection

NIST AI 100-2 E2025, the 2025 update of the official US taxonomy of adversarial ML attacks, classifies attacks across predictive AI and generative AI. The OWASP AI Exchange treats prompt injection as a distinct category but documents the leverage adversaries gain when they can shape generative outputs that users will treat as authoritative. Prompt injection is dangerous because users do not have a reliable way to distinguish an attacker-steered response from a legitimate one, which is again the confabulation problem expressed in a security frame.

Deepfakes

Synthetic image, audio, and video content produced by generative models is the politically visible face of the confabulation risk. Article 50(4) of the AI Act mandates disclosure when deployed content constitutes a deepfake. The European Commission has issued a Code of Practice on marking and labelling of AI-generated content. These are not separate regulatory interventions targeted at separate risks. They are the same intervention applied to different output modalities.

What this means for deployers (the playbook)

A deploying organisation cannot eliminate hallucination at the model level. It can, however, design its deployment so that hallucinated outputs are caught before they reach a user or a regulator. Four layers, each anchored to a named obligation or control:

AI system impact assessment (ISO/IEC 42001 Annex A.5, EU AI Act Article 27 for fundamental-rights impact). Before deploying a generative system on a user-facing surface, document what kinds of outputs the system will produce, which categories of users will see them, and where a confabulated output would create the largest concrete harm. This artefact becomes the entry point for every downstream control.
Disclosure and content provenance (EU AI Act Article 50). Mark machine-generated outputs as such. Apply content provenance metadata (C2PA or equivalent) on synthetic media. Where the model is deployed in a chat interface, the disclosure should be on screen, not buried in a terms-of-service link.
Human-in-the-loop verification on user-facing outputs (EU AI Act Article 14 for high-risk, Article 26 deployer obligations; NIST AI 600-1 Manage function). For outputs that influence regulated decisions (credit, insurance, hiring, clinical triage, legal advice), require a human reviewer with the authority and skill to override. The reviewer’s effectiveness depends on tooling that surfaces uncertainty, not just text.
Incident logging and post-market monitoring (EU AI Act Article 72; ISO/IEC 42001 Clause 9). Log confabulation events the way a software vendor logs production incidents. Track frequency, impact, and mitigations applied. Feed the log back into prompt engineering, retrieval grounding, and retraining choices.
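The disclosure, human-review, and incident-logging layers above can be sketched as a small data model. This is an illustrative sketch under stated assumptions, not a compliance implementation. Every name in it (`GeneratedOutput`, `prepare_output`, `IncidentLog`, the context labels, the impact scale) is hypothetical. It leaves out on-screen disclosure, C2PA metadata, and persistent storage.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical labels for the regulated decision areas named in the text.
REGULATED_CONTEXTS = {"credit", "insurance", "hiring", "clinical_triage", "legal_advice"}


@dataclass
class GeneratedOutput:
    text: str
    context: str                         # where the output is used, e.g. "credit"
    ai_generated: bool = True            # flag the interface can use for disclosure
    requires_human_review: bool = False  # set for regulated contexts


@dataclass
class Incident:
    output: GeneratedOutput
    impact: str        # assumed scale: "low", "medium", "high"
    mitigation: str    # what was done about the confabulated output
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def prepare_output(text: str, context: str) -> GeneratedOutput:
    """Mark the text as machine-generated and route regulated contexts to review."""
    return GeneratedOutput(
        text=text,
        context=context,
        requires_human_review=context in REGULATED_CONTEXTS,
    )


class IncidentLog:
    """In-memory log of confabulation events; a real system would persist these."""

    def __init__(self) -> None:
        self._incidents: list[Incident] = []

    def record(self, output: GeneratedOutput, impact: str, mitigation: str) -> Incident:
        incident = Incident(output=output, impact=impact, mitigation=mitigation)
        self._incidents.append(incident)
        return incident

    def count_by_impact(self) -> Counter:
        # Frequency of incidents per impact level.
        return Counter(i.impact for i in self._incidents)

    def count_by_context(self) -> Counter:
        # Frequency of incidents per deployment context.
        return Counter(i.output.context for i in self._incidents)


output = prepare_output("Applicant has two prior defaults.", context="credit")
log = IncidentLog()
if output.requires_human_review:
    # Suppose the reviewer finds the claim is not in the credit file.
    log.record(output, impact="high", mitigation="reviewer rejected the output")
print(log.count_by_impact())
```

The counts by impact and by context are the kind of summary that can feed back into prompt engineering, retrieval grounding, and retraining decisions.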

A platform like AI Sigil operationalises these layers as a single management-system surface: the impact assessment, the disclosure record, the human-oversight workflow, and the incident log become first-class objects rather than scattered artefacts.

Where this is heading: 2026 horizon

Four signals converge on the back half of 2026. First, the bulk of the EU AI Act applies on 2 August 2026. The Commission’s enforcement powers over GPAI providers, whose obligations have applied since 2 August 2025, take effect on the same date. Member-state market surveillance authorities begin formal enforcement against providers and deployers that fail to mark synthetic content, document training data, or operationalise risk-management duties. Second, the Commission’s DSA enforcement on generative AI risks (already underway with formal RFIs in 2024) is shifting from information gathering to substantive obligations. Platform compliance officers will be asked to demonstrate, with measurement, that hallucinated outputs do not influence electoral conversations or the dissemination of illegal content. Third, the academic literature is repositioning hallucination from a technical curiosity into an epistemic safety problem. The Beyond Accuracy paper’s layered framing (epistemic instability, user misdirection, social-scale effects) is becoming a reference point for regulators and auditors. Fourth, certifications and standards are catching up. ISO/IEC 42001 certifications are scaling. Organisations that built their generative-AI controls around the hallucination risk will find their certification work load-bearing under whatever comes next. Organisations that built their controls around a generic listicle will find themselves re-doing the work.

FAQ

What does “hallucination” mean in generative AI?

Hallucination is the failure mode where a generative model produces output that is plausible-sounding but factually wrong or invented. NIST uses the technical term confabulation. The output is not the product of a software bug or a database lookup error. It is generated by the model in the same way as every other token: by selecting tokens with high local probability given the context. The model does not signal which parts of its output are reliable, which is what makes the failure dangerous.

Is hallucination the same as bias?

No. Bias is a statistical property of how a model’s decisions distribute across groups, contexts, or modalities. Hallucination is a content-level failure where the model produces information that does not correspond to any underlying source. The two can coexist. A confabulating model can be biased. But mitigating bias (through data curation, fairness testing, calibration) does not mitigate confabulation, and vice versa.

Can hallucinations be eliminated by training on more data?

No. The 2025 arXiv paper Hallucination, reliability, and the role of generative AI in science is explicit that even massive increases in training data will not eliminate hallucination, because the errors are produced by the model’s generation process itself, not merely inherited from the training set. Engineering controls (retrieval-augmented generation, output grounding, uncertainty estimation, human review) can reduce frequency and impact, but no current technique removes the failure mode at the model layer.

What does the EU AI Act say about hallucinations specifically?

The Act does not use the word hallucination. It addresses the underlying problem indirectly through Article 50 (transparency and synthetic-content marking), Articles 51 to 55 (GPAI obligations including training-data summaries, technical documentation, and systemic-model evaluation), and Article 26 (deployer responsibilities). The European Commission, when describing the risks in its DSA enforcement, has named hallucinations as the first item.

How can a deployer reduce hallucination risk in production?

Four layers, applied together: a documented AI system impact assessment that names the harm scenarios; disclosure and content-provenance marking on outputs; human-in-the-loop verification on outputs that influence regulated decisions; and incident logging tied to post-market monitoring. None of these eliminate hallucination, but together they convert it from an open exposure into a managed risk with documented controls and an audit trail.

Conclusion

The SERP answers “what is one major risk associated with generative AI models” with a list of ten. The honest answer is one. Hallucination, called confabulation in the US vocabulary, is the dominant risk because it is embedded in the technology rather than enacted by the deployer, because the regulators and the standards converge on it, because the 2025 research shows scaling does not solve it, and because every other commonly cited risk compounds the same verifiability gap. The work for any organisation deploying generative AI is to convert that gap into a managed control surface: impact assessment, disclosure, human oversight, incident logging. Done properly, it survives the 2 August 2026 enforcement deadline. Done as a listicle, it does not. For a deeper look at how an AI governance platform structures these controls, see the AI Sigil platform overview and the broader Industry Insights resources.

Dr. Sarah Mitchell

AI Governance & Risk Lead at AI Sigil. Sarah holds a PhD in Computer Science from Imperial College London with a thesis on machine learning model interpretability. Before joining AI Sigil she spent six years in Deloitte's AI Risk Advisory practice, leading high-risk AI system audits across financial services and healthcare. She is IAPP AIGP certified and a frequent contributor to industry working groups on the EU AI Act and NIST AI Risk Management Framework. Based in London.
