Improving Regulatory Oversight of Medical AI
Consider a technology that can identify pediatric brain tumors faster than a human. That future has arrived. In a recent American Medical Association study, two-thirds of health care professionals reported incorporating artificial intelligence (AI) into their work. However, many clinicians also expressed the need for increased oversight of medical AI.
The Need for Enhanced Oversight
A recent article argues that current oversight of medical AI relies too heavily on a “human in the loop” approach, which expects individual clinicians to review each AI recommendation and incorporate it safely and effectively. This reliance is problematic because most clinicians lack the expertise or time to evaluate medical AI thoroughly. The article calls on policymakers to develop regulatory structures that can promote safe and effective medical AI without depending on individual clinicians.
Design Flaws and Bias in Medical AI
Medical AI systems also carry inherent design flaws that necessitate oversight. One notable example is an algorithm trained to detect pneumonia from chest X-rays that performed well at its development hospital but failed at others: the algorithm had learned to recognize features of the original hospital’s physical environment rather than signs of disease in the patients themselves.
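To illustrate the kind of check that can surface this failure mode, the following sketch (purely illustrative; the model object, column names, and data fields are assumptions, not details from the pneumonia study) evaluates a trained classifier separately on data from each hospital, so that a large gap between the development site and external sites becomes visible.

# Illustrative sketch only: per-site evaluation to surface shortcut learning.
# Assumes a fitted binary classifier `model` with predict_proba and a pandas
# DataFrame `df` whose columns include the model features, a "pneumonia"
# label, and a "hospital" site identifier. All names are hypothetical.
from sklearn.metrics import roc_auc_score

def per_site_auroc(model, df, feature_cols, label_col="pneumonia", site_col="hospital"):
    """Report AUROC separately for each hospital in the evaluation data."""
    results = {}
    for site, site_df in df.groupby(site_col):
        labels = site_df[label_col]
        if labels.nunique() < 2:
            continue  # AUROC is undefined when a site has only one class
        probs = model.predict_proba(site_df[feature_cols])[:, 1]
        results[site] = roc_auc_score(labels, probs)
    return results

# A model that scores well at its development hospital but markedly worse at
# every external site has likely learned site-specific cues rather than disease.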
Bias also compounds these design flaws. Alarmingly, fewer than five percent of AI systems approved by the Food and Drug Administration (FDA) between 2012 and 2020 disclosed racial demographics in their datasets, raising the risk that unrepresentative training data could worsen outcomes for minority racial groups.
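As a rough illustration of why demographic disclosure matters, the following sketch (the column name, group labels, and reference shares are invented for illustration, not drawn from the FDA review) compares the racial composition of a training dataset against a reference population and flags groups that are substantially underrepresented.

# Illustrative sketch only: compare training-data demographics with a
# reference population. `train_df` is assumed to be a pandas DataFrame with
# a hypothetical "race" column; the reference shares are also hypothetical.
def representation_gaps(train_df, reference_shares, group_col="race", min_ratio=0.5):
    """Flag groups whose share of the training data falls well below their
    share of the reference population."""
    observed = train_df[group_col].value_counts(normalize=True)
    flags = {}
    for group, ref_share in reference_shares.items():
        obs_share = float(observed.get(group, 0.0))
        if obs_share < min_ratio * ref_share:
            flags[group] = (obs_share, ref_share)
    return flags

# Example with made-up reference shares:
# gaps = representation_gaps(train_df, {"Black": 0.13, "Hispanic": 0.19, "Asian": 0.06, "White": 0.59})
# Each flagged group signals that model performance should be checked for that
# group before the system is cleared for general use.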
Current Regulatory Frameworks
Current regulatory frameworks address these design flaws and biases through two layers of governance—central and local. At the central level, federal agencies and medical organizations develop nationwide frameworks and rules to monitor and test medical AI. For instance, the FDA requires many AI systems to meet safety and efficacy standards before they can enter the market.
However, current FDA guidance emphasizes that a product’s AI recommendations should equip clinicians with “patient-specific information” so that they can review the basis for each recommendation and apply their own judgment when making final decisions. This approach places significant governance responsibility on individual clinicians.
Local Governance Challenges
Local governance shifts oversight to hospitals that test AI systems within their own clinical environments. However, this requires substantial expertise, infrastructure, and staff resources—requirements that many smaller or underfunded hospitals cannot meet without leaning on their clinicians.
Clinicians often play a “corrective role”—checking errors, adjusting for situation-specific factors, and identifying bias. Unfortunately, studies have found that clinicians frequently fail to identify flawed or biased models, even when provided with explanatory tools to make issues visible.
Barriers to Effective Oversight
Clinicians’ difficulty in overseeing medical AI stems partly from gaps in knowledge: many clinicians lack familiarity with general AI principles or with systematic bias. Although this gap may narrow over time as medical schools integrate AI lessons into their curricula, the tendency to trust AI recommendations and ignore contradictory information, known as automation bias, will continue to cloud clinicians’ ability to evaluate the accuracy of individual AI recommendations in high-pressure situations.
Moreover, the rushed pace of the modern health care system often leaves little room for the careful oversight that current governance frameworks hope to achieve.
Proposed Solutions
In the short term, when clinicians remain involved, regulators and hospitals should define clear and limited roles for them. This approach ensures that oversight of medical AI does not compromise clinicians’ other responsibilities or set them up for failure. Additionally, institutional support such as onboarding, training, and ongoing monitoring can assist clinicians in overseeing safe and effective use of AI technology.
In the long term, the article envisions a framework in which medical AI functions without constant clinician oversight. Regulators and medical organizations should evaluate AI systems as independent tools capable of operating in settings with limited medical expertise. This shift would reduce dependence on overextended clinicians and could help democratize access to medical expertise.
To facilitate this transition, the article urges the FDA and medical organizations to adopt evaluation methods that test AI performance under minimal clinician oversight. Once medical AI products reach the market, experts can conduct “spot-checking,” auditing random samples of a system’s outputs over time to ensure that programs run as intended.
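One way to picture this kind of spot-checking is the following sketch (the log format, sample size, and agreement threshold are assumptions chosen for illustration): periodically draw a random sample of the deployed system’s recent decisions, have experts label those cases, and flag the product for review if agreement falls below a pre-set baseline.

# Illustrative sketch only: a periodic spot-check audit of a deployed model.
# Record fields, sample size, and the 0.90 baseline are hypothetical choices.
import random

def spot_check(decision_log, expert_label_fn, sample_size=200, baseline=0.90, seed=0):
    """Audit a random sample of logged decisions against expert labels.
    decision_log: list of records like {"case_id": ..., "model_label": ...}
    expert_label_fn: callable returning the expert's label for a case_id
    """
    rng = random.Random(seed)
    sample = rng.sample(decision_log, min(sample_size, len(decision_log)))
    agreements = sum(
        1 for record in sample
        if expert_label_fn(record["case_id"]) == record["model_label"]
    )
    rate = agreements / len(sample)
    return {"agreement_rate": rate, "passes": rate >= baseline}

# Repeating the audit at regular intervals gives regulators a drift signal
# without requiring a clinician to scrutinize every individual recommendation.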
Conclusion
Recognizing the challenges ahead, the article concludes that medical AI will only improve health outcomes if oversight frameworks anticipate clinicians’ real-world limitations rather than assume perfect performance.