AI Security and Safety: Building Trustworthy Systems Amid New Risks

Mitigating AI’s New Risk Frontier: Unifying Enterprise Cybersecurity with AI Safety

These are exciting times for AI. Enterprises are blending AI capabilities with enterprise data to deliver better outcomes for employees, customers, and partners. However, as organizations integrate AI more deeply into their systems, the associated data and infrastructure become increasingly attractive targets for cybercriminals and other adversaries.

Generative AI (gen AI) significantly expands an organization’s attack surface, introducing new risks. This necessitates a careful evaluation of potential threats, vulnerabilities, and their implications for business operations. Deploying AI with a strong security posture, in compliance with regulations, and in a trustworthy manner requires more than patchwork defenses; it demands a strategic shift. Security must be an integral part of the entire AI strategy.

AI Security vs. AI Safety

AI security and AI safety are related yet distinct concepts, both essential for risk reduction but addressing different challenges.

AI Security

AI security focuses on protecting the confidentiality, integrity, and availability of AI systems. The objective is to prevent malicious actors from attacking or manipulating these systems. Common threats include:

Exploiting vulnerabilities in software components
Misconfigured systems or broken authentication allowing unauthorized access to sensitive enterprise data
Prompt injection or model “jailbreaking”

In essence, AI security resembles traditional IT security but is tailored for new types of systems.

AI Safety

AI safety, on the other hand, maintains the intended behavior of AI systems, ensuring alignment with company policies, regulations, and ethical standards. The risks here are not about system compromise, but about the outputs the systems generate. Safety issues include:

Harmful or toxic language
Bias or discrimination in responses
Hallucinations (plausible-sounding but false answers)
Dangerous or misleading recommendations

A lapse in safety can erode trust, cause reputational damage, and even create legal or ethical liabilities.

Examples of Security and Safety Risks

AI Security Risks

Memory safety issues: AI system components may be vulnerable to traditional memory safety issues, such as heap overflow attacks.
Insecure configurations: An exposed server without authentication may allow attackers to access sensitive information.
Broken authentication or authorization: Accessing server functionalities with incorrect credentials can lead to security breaches.
Malware: A supply chain attack may involve malicious code added to a trusted open-source model.

AI Safety Risks

Bias: The model’s responses may discriminate against specific social groups based on race, religion, gender, or other characteristics.
Hallucination: The model may produce responses that are not based on facts.
Harmful responses: The model might provide dangerous advice, for instance, in medical contexts.
Toxic language: The model may generate offensive or abusive content.

The New Risk Frontier: Demystifying “Safety”

While foundational security is critical, large language models (LLMs) introduce an additional risk domain: safety. In AI contexts, safety refers to trustworthiness, fairness, and ethical alignment.

A security failure might expose sensitive data, whereas a safety failure could produce biased hiring recommendations or toxic responses in customer interactions, undermining trust.

This unpredictability creates governance challenges. Even with safe training data, a model can still generate harmful content, and fine-tuning may unintentionally weaken built-in safeguards.

Managing these risks requires a new toolset, including:

Pre-training and alignment focused on safety
Continuous evaluation and benchmarking of safety outcomes
“LLM evaluation harnesses” to measure the likelihood of harmful outputs

Should the AI Model Get All the Attention?

AI models often attract the spotlight, but they are just one component of a broader AI system. Non-AI components (servers, APIs, orchestration layers) often make up the bulk of the system and carry their own risks.

The right approach is a defense-in-depth strategy—strengthen existing enterprise security practices, then layer on additional protections for the novel risks that generative AI introduces.

Unifying AI Security with Enterprise Cybersecurity

AI security does not require reinventing the wheel. The same principles that underpin traditional enterprise security apply here. Existing investments in cybersecurity form the foundation of an AI security posture.

Security is fundamentally about managing risk—reducing the chances of a threat materializing and minimizing its impact. Key strategies include:

Risk modeling: Identify and prioritize AI-related risks.
Penetration testing: Measure system robustness.
Continuous monitoring: Detect and respond to incidents.

The classic Confidentiality, Integrity, Availability (CIA) triad remains applicable, whether the system is AI-powered or not. To enhance AI security, enterprises should extend proven practices, such as:

Secure software development lifecycle (SDLC): Incorporate security at every phase.
Secure supply chain: Vet every model, library, and container to guard against vulnerabilities.
SIEM and SOAR tools: Monitor inputs, system stress, and potential data exfiltration.

Enterprises should integrate AI risk into existing governance frameworks, such as an Information Security Management System (ISMS), rather than creating standalone AI programs. Treating AI security as an extension of existing posture allows leveraging the same skills, tools, and processes for more effective risk management.

Moving Forward

In summary, enterprises should adopt a holistic approach to AI security and safety. Frameworks highlight the need for integrated risk management that combines security, governance, and compliance with AI-specific safeguards.

The bottom line is that AI presents vast opportunities but also unique risks. By unifying AI security with established cybersecurity practices and layering in safety measures, enterprises can build more trustworthy systems that deliver significant business value.

Conclusion

AI security and AI safety are two sides of the same coin—one focused on protecting systems from attacks, the other on ensuring responsible and trustworthy behavior. Enterprises that integrate AI security and safety into their overall risk management strategy will be best positioned to unlock AI’s business value while maintaining the trust of customers, employees, and regulators.