Deploying Responsible AI with Vertex AI and Gemini Models

From Zero to Vertex AI: Invoking Gemini with Responsible AI Principles

This article serves as a tutorial on deploying a FastAPI application to Google Cloud Run that invokes Gemini models via Vertex AI while adhering to responsible AI principles.

Introduction

The guide illustrates how to configure safety filters for four harm categories: dangerous content, harassment, hate speech, and sexually explicit content, using strict blocking thresholds. It utilizes Vellox as an adapter for running ASGI applications in Google Cloud Functions and implements Bearer token authentication for enhanced security.

Moreover, the tutorial details the entire setup process, including enabling necessary Google Cloud services, configuring IAM roles, and deploying the function with environment variables.

This tutorial emphasizes practical safety implementations by demonstrating how Vertex AI screens both inputs and outputs. It returns a “SAFETY” finish reason accompanied by detailed safety ratings when harmful content is detected, which is particularly beneficial for developers aiming to build AI applications with integrated content moderation and security.

Technology Used

Cloud Run Functions:

  • Designed for quick responses to events or HTTP triggers.
  • Minimal configuration required — all infrastructure is managed for you.
  • Suitable for concise functions rather than full-fledged services.

Vellox:

Vellox is an adapter that allows the execution of ASGI applications (Asynchronous Server Gateway Interface) in Google Cloud Functions.

HTTPBearer:

HTTPBearer in FastAPI is a security utility provided by the fastapi.security module. It manages Bearer token authentication, a common method for securing API endpoints by handling the presence and extraction of the Bearer token.
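As a minimal sketch (the endpoint name and token check are illustrative, not the repository's exact code), HTTPBearer can be wired into a FastAPI app like this:

import os
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

app = FastAPI()
bearer_scheme = HTTPBearer()

def verify_token(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)) -> None:
    # Compare the presented token against the API_TOKEN environment variable
    # set at deploy time; reject anything else.
    if credentials.credentials != os.environ.get("API_TOKEN"):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or missing Bearer token",
        )

@app.get("/health", dependencies=[Depends(verify_token)])
def health() -> dict:
    return {"status": "ok"}

Any route that declares the dependency returns 401 unless the caller sends a matching Authorization: Bearer header.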

Steps to Implement

Development Environment Setup:

Use the provided devcontainer to install all necessary components: set up Docker and the Dev Containers tooling, pull the code, and you are ready to go.

Enable Services:

First, initialize the gcloud CLI:

gcloud init

Then enable the required services:

gcloud services enable artifactregistry.googleapis.com cloudbuild.googleapis.com run.googleapis.com logging.googleapis.com aiplatform.googleapis.com

IAM Permissions:

In IAM, grant the ‘roles/aiplatform.user’ role at the project level to the principal the function runs as (for example, the Cloud Run service account), so that it can call Vertex AI.
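For example, the binding can be added with gcloud; PROJECT_ID and SERVICE_ACCOUNT_EMAIL below are placeholders for your own values:

gcloud projects add-iam-policy-binding PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" --role="roles/aiplatform.user"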

Deploy with Environment Variables:

Use the following command to deploy:

gcloud run deploy fastapi-func --source . --function handler --base-image python313 --region asia-south1 --set-env-vars API_TOKEN="damn-long-token",GOOGLE_GENAI_USE_VERTEXAI=True,GOOGLE_CLOUD_LOCATION=global --allow-unauthenticated

This command:

  • Deploys a FastAPI function named handler from your local folder.
  • Runs on Python 3.13, in the Mumbai (asia-south1) region.
  • Sets environment variables for API tokens and Google Vertex AI usage.
  • Makes the function publicly reachable (no Cloud Run IAM authentication); requests are still checked against the Bearer token, as in the invocation sketch below.
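Once deployed, the service can be exercised with any HTTP client. The sketch below assumes a /generate route and a JSON body with a prompt field, which are illustrative only; substitute the URL printed by gcloud run deploy, the actual route, and the request schema defined in main.py:

import httpx

# Placeholder URL and route: replace with your deployed service's values.
URL = "https://fastapi-func-XXXXXXXX.asia-south1.run.app/generate"
headers = {"Authorization": "Bearer damn-long-token"}  # must match API_TOKEN

resp = httpx.post(
    URL,
    json={"prompt": "Summarise responsible AI in one line."},
    headers=headers,
    timeout=60.0,
)
print(resp.status_code)
print(resp.json())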

Walkthrough of main.py

The core of the implementation involves a simple FastAPI application integrated with Google Gemini AI, while also incorporating safety content filters.

import httpx, os, uuid
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from vellox import Vellox
from pydantic import BaseModel
from typing import Optional
from google import genai
from google.genai.types import (
    GenerateContentConfig,
    HarmCategory,
    HarmBlockThreshold,
    HttpOptions,
    SafetySetting,
)
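Not shown above is the glue between FastAPI and the Cloud Run function runtime. A minimal sketch of how the handler entry point referenced in the deploy command is typically wired with Vellox (the lifespan setting and exact wiring may differ in the actual repository):

app = FastAPI()

# Vellox adapts the ASGI app so the function runtime can invoke it through a
# single callable; the callable's name must match the --function flag used at
# deploy time.
vellox = Vellox(app=app, lifespan="off")

def handler(request):
    return vellox(request)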

The safety_settings variable is defined as a list of SafetySetting objects, each specifying a harm category along with a block threshold. This includes:

  • Dangerous content
  • Harassment
  • Hate speech
  • Sexually explicit content

All categories are configured to block at the BLOCK_LOW_AND_ABOVE threshold, ensuring strict moderation.
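In code, the list looks roughly like this (formatting may differ from the repository, but the categories and threshold are as described above):

safety_settings = [
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    ),
]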

Essentially, these settings allow the application to screen both inputs and outputs. If the model assesses the content as harmful, the call is blocked and no text is returned. By default, Gemini employs a severity-aware harm-block method in Vertex AI, which can be adjusted as necessary.
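The settings are passed to the model through GenerateContentConfig. The sketch below shows the pattern; the model name and the response handling are illustrative, not taken from the repository:

client = genai.Client(http_options=HttpOptions(api_version="v1"))

response = client.models.generate_content(
    model="gemini-2.0-flash-001",  # assumed model name
    contents="User-supplied prompt goes here",
    config=GenerateContentConfig(safety_settings=safety_settings),
)

candidate = response.candidates[0]
if candidate.finish_reason and candidate.finish_reason.name == "SAFETY":
    # The response was blocked; candidate.safety_ratings details which harm
    # category triggered the block.
    print(candidate.safety_ratings)
else:
    print(response.text)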

Conclusion

The implementation of responsible AI principles in deploying FastAPI applications using Google Cloud Run and Vertex AI is crucial in developing secure, ethical AI solutions. By leveraging these technologies, developers can effectively manage content moderation while ensuring the safety and integrity of their applications.

For further details, visit the GitHub repository.
