What Is a Frontier Model? Definition, Risks, and Rules

Key takeaways

  • A frontier model is the most capable class of general-purpose AI at the current edge of capability, such as the systems behind GPT-4 class assistants, Claude, Gemini, and Llama.
  • “Frontier model” is a moving target defined by capability and training compute, not by a fixed feature list. Today’s frontier becomes tomorrow’s baseline.
  • It is a subset of foundation models, and it is broader than a large language model: a frontier model can be multimodal, not just text.
  • The EU AI Act governs the most capable frontier models as general-purpose AI with systemic risk, presumed once training compute passes 10 to the power of 25 floating-point operations.
  • Providers of these models face concrete duties: evaluation and adversarial testing, systemic-risk mitigation, incident reporting, and cybersecurity.

What is a frontier model?

A frontier model is the most capable class of general-purpose AI in existence at a given moment: the handful of systems that sit at the leading edge of what AI can do. In practice the label points to models like the ones behind GPT-4 class assistants, Anthropic’s Claude, Google’s Gemini, Meta’s Llama, and Mistral’s largest releases. The term is deliberately relative: a frontier model is defined by how it compares to everything else available, so the frontier moves forward as more capable systems are released.

Underneath the label sits a precise legal idea. A frontier model is a kind of general-purpose AI (GPAI) model, which the EU AI Act defines in Article 3(63) as a model, typically trained on large amounts of data using self-supervision at scale, that displays significant generality, can competently perform a wide range of distinct tasks, and can be integrated into a variety of downstream systems. A frontier model is the high end of that category: the most general, most capable, and most compute-intensive examples.

Two properties separate a frontier model from an ordinary AI system. The first is generality: it is not built for one task but adapts to many, from drafting text to writing code to analyzing images. The second is scale: frontier models are trained on enormous datasets with vast amounts of compute, which is why only a few organizations build them. Those two properties are also what make a frontier model a governance concern rather than just an engineering achievement.

Frontier model vs foundation model vs LLM

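The three terms overlap, which is why people so often ask how they differ. The clean way to hold them apart is to treat foundation models as the broad category, frontier models as the small subset of it at the capability edge, and large language models as a category defined by modality rather than capability, one that overlaps with both.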

Frontier model vs foundation model

A foundation model is any large model trained on broad data that can be adapted to many downstream tasks. That is a wide category: it includes thousands of models of every size, open and closed, old and new. A frontier model is the small subset of foundation models at the capability frontier. Put simply, every frontier model is a foundation model, but the overwhelming majority of foundation models are not frontier models. The difference is capability and compute, not architecture.

Frontier model vs large language model

A large language model (LLM) is a model type defined by its modality: it processes and generates language. A frontier model is defined by its position at the capability edge, regardless of modality. Many frontier models are multimodal, handling text, images, audio, and more, so they are broader in scope than the LLM label suggests. And most LLMs are not frontier models, because the category includes many small and mid-sized language models. The honest answer to “Is ChatGPT a frontier model?” is that the most capable models behind it qualify, while smaller or older language models do not.

How frontier models are built

The defining input of a frontier model is compute. Training one means running an enormous number of mathematical operations over a very large dataset, a process measured in floating-point operations, or FLOP. The scale is hard to picture: GPT-3, released in 2020, was trained at roughly 3.14 times 10 to the power of 23 FLOP, and the leading models that followed pushed well beyond 10 to the power of 25.
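The GPT-3 figure can be sanity-checked with a widely used rule of thumb that this article does not itself state: for dense transformer models, training compute is roughly 6 × parameters × training tokens. The sketch below applies that heuristic to GPT-3's publicly reported size (about 175 billion parameters and about 300 billion training tokens, both assumptions taken from outside this article). The function name is illustrative, and the result is an estimate, not an official measurement method.

```python
def estimate_training_flop(parameters: float, training_tokens: float) -> float:
    """Approximate training compute with the 6 * N * D rule of thumb.

    Heuristic only: roughly 6 floating-point operations per parameter per
    training token (forward plus backward pass) for a dense transformer.
    """
    return 6.0 * parameters * training_tokens


# Assumed public figures for GPT-3: ~175e9 parameters, ~300e9 training tokens.
gpt3_flop = estimate_training_flop(175e9, 300e9)
print(f"{gpt3_flop:.2e}")  # 3.15e+23
```

The estimate lands within rounding distance of the 3.14 × 10^23 figure quoted above, and about a factor of 30 below 10^25.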

That scale produces a property regulators care about: emergent capabilities, abilities that were not explicitly designed and that appear only once a model crosses a certain size. Emergence is why a frontier model cannot be fully specified in advance, and why testing it after training matters as much as designing it. It is also why so few organizations operate at the frontier, since the cost of compute, data, and talent is out of reach for most. For a governance team, the practical takeaway is that a frontier model is a third-party dependency you rarely build and usually consume, which is exactly the kind of dependency an AI registry exists to track.

Why frontier models carry systemic risk

The same generality that makes a frontier model useful also makes its failures broad. A narrow model that misfires affects one task; a frontier model integrated into thousands of downstream products can spread a single weakness across the economy. Regulators call this systemic risk: a risk specific to the high-impact capabilities of the most powerful models, with the potential for significant effects on public health, safety, security, or fundamental rights.

The risks are not hypothetical. The NIST Generative AI Profile, NIST AI 600-1, catalogs twelve risks that are unique to or exacerbated by generative AI, including CBRN information or capabilities, confabulation, information integrity (which covers synthetic media), information security (which covers prompt injection), and value chain and component integration. Many of these, such as confabulation, commonly called hallucination, behave nothing like traditional software bugs, which is why frontier models need their own AI governance treatment rather than a reused IT-security checklist.

How the EU AI Act governs frontier models

The EU AI Act is the first major law to regulate frontier models directly, and it does so through the category of general-purpose AI with systemic risk. Rather than name specific products, the EU AI Act sets a measurable trigger: a GPAI model is presumed to carry systemic risk when the cumulative compute used for its training exceeds 10 to the power of 25 FLOP (Article 51). GPT-4 class models are widely estimated to sit above that line; GPT-3, at roughly 3 times 10 to the power of 23, sits well below it. The threshold is a proxy for high-impact capability, and the Commission can also designate a model that falls below the threshold on the basis of other criteria, which the Act lists in Annex XIII.
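As a minimal sketch, the compute presumption reduces to a single comparison. The constant and function names below are illustrative, and the check covers only the compute trigger: it says nothing about models the Commission designates on other grounds.

```python
EU_SYSTEMIC_RISK_FLOP = 1e25  # Article 51 presumption threshold cited above


def presumed_systemic_risk(training_flop: float) -> bool:
    """Return True when cumulative training compute exceeds the threshold.

    Models only the compute presumption. Designation by the Commission on
    other criteria is outside the scope of this check.
    """
    return training_flop > EU_SYSTEMIC_RISK_FLOP


print(presumed_systemic_risk(3.14e23))  # GPT-3 scale: False
print(presumed_systemic_risk(2e25))     # hypothetical larger run: True
```

The comparison is strict because the trigger applies when compute exceeds the threshold, so a run at exactly 10^25 FLOP would not be presumed in scope by this check.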

Crossing the threshold turns on a set of obligations under Article 55. Providers of frontier models with systemic risk must evaluate the model, including adversarial testing or red-teaming; assess and mitigate systemic risks; track and report serious incidents to the EU AI Office and, where appropriate, to national competent authorities; and ensure an adequate level of cybersecurity. Under Article 52 they must also notify the Commission, in practice through the AI Office, without delay and in any event within two weeks of meeting the threshold or learning that they will meet it. Serious-incident reports follow tight deadlines once an incident is known.
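For a compliance calendar, the two-week notification window is simple date arithmetic. The sketch below is an illustration with a hypothetical function name and example date. It treats the window as 14 calendar days, which is an assumption about how the period is counted, and the result is an outer bound rather than a target.

```python
from datetime import date, timedelta

NOTIFICATION_WINDOW = timedelta(weeks=2)  # assumed to mean 14 calendar days


def notification_deadline(threshold_met_on: date) -> date:
    """Latest notification date under the two-week window described above."""
    return threshold_met_on + NOTIFICATION_WINDOW


# Hypothetical date on which a training run crosses the compute threshold.
print(notification_deadline(date(2025, 9, 1)))  # 2025-09-15
```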

The operational detail lives in the GPAI Code of Practice, published by the EU AI Office in July 2025, whose Safety and Security, Transparency, and Copyright chapters translate the legal duties into documentation and process that providers can actually follow. For an organization deploying a frontier model rather than building one, the practical move is to confirm the provider’s status and to attach the relevant obligations to that model inside your own program, the work an AI framework activation platform is designed to carry.

Frontier AI safety beyond the EU

Europe is not alone in singling out frontier models, though the approaches differ. In the United States, a 2023 executive order (Executive Order 14110) introduced a compute-reporting threshold of 10 to the power of 26 FLOP for dual-use foundation models, requiring developers to report large training runs. That order was rescinded in January 2025 after the change in administration, so the US reporting regime is best treated as evolving rather than settled.

Industry has also moved on its own. The Frontier Model Forum, founded by several of the largest developers, coordinates research on frontier AI safety, capability assessments, and shared standards, and individual labs publish frontier safety frameworks that commit them to pause or add safeguards when a model crosses defined capability levels. These voluntary efforts are not law, but they shape the baseline expectations a governance team should hold a frontier-model vendor to.

What frontier models mean for your governance program

For most organizations the relevant question is not how to build a frontier model but how to govern the ones they use. The starting point is visibility. Frontier models usually enter an organization through products and APIs, often via shadow AI that no one formally approved, so the first task is to inventory which frontier models sit inside your tools and workflows.

From there the path is the same risk loop any serious program runs. Record each frontier model as a dependency, attach the obligations that apply (EU AI Act duties where relevant, plus your internal policies), measure the risks that matter for your use case, and monitor for change as the provider ships new versions. The frontier moves, so your inventory should move with it; continuous AI risk management keeps the evidence current rather than frozen at the moment of a single approval.
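The record-attach-monitor loop can be sketched as a small data structure. Everything here is hypothetical: the class, field names, and example values are illustrations of what an inventory entry might hold, not the schema of any particular registry product.

```python
from dataclasses import dataclass, field


@dataclass
class FrontierModelRecord:
    """One frontier-model dependency in a hypothetical AI inventory."""
    name: str
    provider: str
    approved_version: str
    obligations: list[str] = field(default_factory=list)


def needs_review(record: FrontierModelRecord, observed_version: str) -> bool:
    """Flag a record when the version in use differs from the one reviewed."""
    return observed_version != record.approved_version


record = FrontierModelRecord(
    name="example-model",         # placeholder, not a real product
    provider="Example Provider",  # placeholder
    approved_version="2025-01",
    obligations=["EU AI Act duties where relevant", "internal policies"],
)

print(needs_review(record, "2025-01"))  # False: same version as approved
print(needs_review(record, "2025-06"))  # True: provider shipped a new version
```

Keeping the approved version on the record is the design choice that matters: it turns "monitor for change" into a comparison that can run on a schedule instead of a one-time approval.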

FAQ

What is a frontier model in simple terms? A frontier model is the most capable kind of general-purpose AI available at a given time, the small set of systems at the leading edge of what AI can do. They are trained on huge datasets with vast compute, which makes them highly general but also expensive and few in number.

Is ChatGPT a frontier model? The most capable models behind ChatGPT, such as the GPT-4 class systems, are generally considered frontier models. Smaller or older language models, including some that also power chat interfaces, are not, because the term tracks the capability frontier rather than the product name.

What is the difference between a frontier model and a foundation model? A foundation model is any large model trained on broad data and adaptable to many tasks, a very wide category. A frontier model is the small subset of foundation models at the capability frontier. Every frontier model is a foundation model, but most foundation models are not frontier models.

What is a frontier model vs an LLM? A large language model is defined by modality, since it works with language. A frontier model is defined by capability and can be multimodal. Many frontier models are more than LLMs, and most LLMs are not at the frontier.

Does the EU AI Act regulate frontier models? Yes. The EU AI Act regulates the most capable frontier models as general-purpose AI with systemic risk, presumed once training compute passes 10 to the power of 25 FLOP. Providers then face obligations around evaluation, risk mitigation, incident reporting, and cybersecurity.

Why are frontier models considered risky? Their generality means a single weakness can spread across the many products that build on them, which is what regulators call systemic risk. They also display emergent capabilities that appear only at scale and cannot be fully predicted before training, so they need testing and monitoring rather than a one-time review.

Conclusion

A frontier model is best understood as a position, not a product: the most capable general-purpose AI at the current edge, a small subset of foundation models and something broader than a language model. That position is exactly what makes it a governance subject. The same scale that produces emergent capability also produces systemic risk, which is why the EU AI Act regulates frontier models directly and why industry safety frameworks have grown up around them. For an organization, the work is not to chase the definition but to know which frontier models you depend on and to attach the right obligations to each one. The fastest way to do that is to put every frontier model on the map and activate the frameworks that govern it.
