Deploying ADK Agents to Vertex AI Agent Engine

Author: Venkata Sudhakar

When your ADK agent is tested and ready for production, Vertex AI Agent Engine is the recommended deployment target. Agent Engine is a fully managed Google Cloud service that handles infrastructure provisioning, automatic scaling, session persistence across requests, and security, so you can focus on agent logic. Your ADK agent code runs unchanged in Agent Engine; the only additions are the AdkApp wrapper and the deployment calls. Once deployed, Agent Engine exposes your agent through a managed API with IAM authentication, audit logging, and per-project quota management.

Deployment uses the Vertex AI SDK for Python. You wrap your ADK agent in an AdkApp object, pass it to vertexai.agent_engines.create() to deploy it to Agent Engine, and receive a remote_app handle. You then query it with remote_app.stream_query() or remote_app.async_stream_query(), the same methods the local AdkApp object exposes, so local testing and production calls look alike. Agent Engine uses cloud-based managed sessions, so conversation history persists across multiple API calls to the same session, even when requests are served by distributed infrastructure.

The example below deploys the ShopMax customer service agent from Tutorial 312 to Vertex AI Agent Engine. It shows the full workflow, from local testing to cloud deployment, and how to query the deployed agent.

# pip install "google-cloud-aiplatform[agent_engines,adk]"
# Prereq: gcloud auth application-default login
# Prereq: gcloud config set project your-gcp-project

import vertexai
from vertexai.agent_engines import AdkApp
from google.adk.agents import Agent

# staging_bucket is needed for deployment; the bucket name here is a placeholder
vertexai.init(
    project="your-gcp-project",
    location="us-central1",
    staging_bucket="gs://your-staging-bucket"
)

# Same agent definition as Tutorial 312 - no changes needed
def get_order_status(order_id: str) -> dict:
    orders = {
        "ORD-88421": {"status": "Out for delivery", "eta": "Today 7pm"},
        "ORD-55987": {"status": "Delivered", "date": "30 March 2025"}
    }
    return orders.get(order_id.upper(), {"error": "Order not found"})

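def check_availability(product_name: str, city: str = "all") -> dict:
    stock = {
        "iphone 15 pro": {"Mumbai": 12, "Delhi": 7, "Bangalore": 15},
        "airpods pro":   {"Mumbai": 0, "Delhi": 0, "Bangalore": 2}
    }
    s = stock.get(product_name.lower(), {})
    if city.lower() != "all":
        # Honor the city argument; an unknown city yields an empty result
        s = {c: n for c, n in s.items() if c.lower() == city.lower()}
    return {"available": any(v > 0 for v in s.values()), "stock": s}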

agent = Agent(
    model="gemini-2.0-flash",
    name="shopmax_support",
    instruction=(
        "You are a helpful ShopMax India customer service agent. "
        "Use tools to answer order and product queries with specific details."
    ),
    tools=[get_order_status, check_availability]
)

# Wrap in AdkApp for Agent Engine deployment
app = AdkApp(agent=agent)

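# Test locally first - the local AdkApp exposes the same query methods as the deployed app
import asyncio

print("Testing locally...")

async def test_local():
    async for event in app.async_stream_query(
        user_id="test-user",
        message="Where is order ORD-88421?"
    ):
        # Events are dicts; print only those that carry a text part
        parts = event.get("content", {}).get("parts", [])
        if any("text" in part for part in parts):
            print("Local:", event)

asyncio.run(test_local())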
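
Because ADK tools are plain Python functions, they can also be checked without a model or a Google Cloud project before they are wrapped in an agent. The sketch below repeats a version of the two tool functions (with check_availability honoring its city argument) so that it is self-contained. The run_tool_checks helper and the values it checks are illustrative assumptions, not part of ADK.

```python
def get_order_status(order_id: str) -> dict:
    orders = {
        "ORD-88421": {"status": "Out for delivery", "eta": "Today 7pm"},
        "ORD-55987": {"status": "Delivered", "date": "30 March 2025"},
    }
    return orders.get(order_id.upper(), {"error": "Order not found"})


def check_availability(product_name: str, city: str = "all") -> dict:
    stock = {
        "iphone 15 pro": {"Mumbai": 12, "Delhi": 7, "Bangalore": 15},
        "airpods pro": {"Mumbai": 0, "Delhi": 0, "Bangalore": 2},
    }
    s = stock.get(product_name.lower(), {})
    if city.lower() != "all":
        # Keep only the requested city; unknown cities give an empty dict
        s = {c: n for c, n in s.items() if c.lower() == city.lower()}
    return {"available": any(v > 0 for v in s.values()), "stock": s}


def run_tool_checks() -> None:
    """Hypothetical helper: exercise the tools with known inputs."""
    # Order lookup is case-insensitive because of order_id.upper()
    assert get_order_status("ord-88421")["status"] == "Out for delivery"
    assert get_order_status("ORD-00000") == {"error": "Order not found"}
    # City filter narrows the stock dict to one entry
    assert check_availability("iPhone 15 Pro", "Delhi") == {
        "available": True,
        "stock": {"Delhi": 7},
    }
    # Zero stock in the requested city means not available there
    assert check_availability("AirPods Pro", "Mumbai")["available"] is False
    # Unknown products return an empty, unavailable result
    assert check_availability("unknown item") == {"available": False, "stock": {}}


if __name__ == "__main__":
    run_tool_checks()
    print("Tool checks passed")
```

Running checks like these first separates tool bugs from model or deployment problems, which are slower to diagnose once the agent is remote.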

The next block deploys the agent to Vertex AI Agent Engine and queries the live deployment:

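# DEPLOY to Vertex AI Agent Engine
# Needs staging_bucket set in vertexai.init(); pin the requirements for real use
from vertexai import agent_engines

print("Deploying to Agent Engine...")
remote_app = agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[agent_engines,adk]"],
    display_name="shopmax-customer-support",
    description="ShopMax India customer service agent v1.0"
)
print("Deployed! Resource name:", remote_app.resource_name)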

# QUERY the deployed agent - same interface as local testing
print("\nQuerying deployed agent...")
import asyncio

async def query_deployed():
    questions = [
        ("Where is my order ORD-88421?",        "user-001"),
        ("Is iPhone 15 Pro in stock in Delhi?", "user-002")
    ]
    for question, user_id in questions:
        # Managed sessions are created by the service, so create one first and
        # reuse its ID to continue the same conversation (the "id" key is as
        # returned by the deployed app; check your SDK version)
        session = remote_app.create_session(user_id=user_id)
        print("Customer:", question)
        async for event in remote_app.async_stream_query(
            user_id=user_id,
            session_id=session["id"],
            message=question
        ):
            content = event.get("content", {})
            for part in content.get("parts", []):
                if "text" in part:
                    print("Agent:   ", part["text"])
        print()

asyncio.run(query_deployed())

# UPDATE the agent (re-deploy after making changes)
updated_agent = Agent(
    model="gemini-2.0-flash",
    name="shopmax_support_v2",
    instruction=(
        "You are a ShopMax India customer service agent. "
        "Use tools to answer queries. "
        "Always end with: Is there anything else I can help you with?"
    ),
    tools=[get_order_status, check_availability]
)
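# update() takes the re-wrapped AdkApp rather than the bare Agent
remote_app.update(
    agent_engine=AdkApp(agent=updated_agent),
    requirements=["google-cloud-aiplatform[agent_engines,adk]"]
)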
print("Agent updated to v2 with zero downtime")

The deployment and queries produce output along the following lines (the agent's wording varies from run to run):

Testing locally...
Local: {"author": "shopmax_support", "content": {"parts": [{"text": "Your order ORD-88421..."}]}}

Deploying to Agent Engine...
Deployed! Resource name: projects/my-project/locations/us-central1/reasoningEngines/1234567890

Querying deployed agent...
Customer: Where is my order ORD-88421?
Agent:    Your order ORD-88421 is out for delivery and
          expected today by 7pm.

Customer: Is iPhone 15 Pro in stock in Delhi?
Agent:    Yes, iPhone 15 Pro has 7 units available in Delhi.
          You can order online for same or next-day delivery.

Agent updated to v2 with zero downtime

# Local testing and production use identical code - only .create() changes
# Agent Engine handles: scaling, sessions, IAM, audit logs, health checks
# Session state persists across API calls - same user/session = same conversation
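
The event-handling loop in query_deployed() can be factored into a small helper that collects only the text parts of a response. This is a minimal sketch, assuming events are dicts shaped like the "Local:" line in the output above. The extract_text name and the sample events, including the shape of the function-call part, are hand-written for illustration and are not captured output.

```python
from typing import Any, Dict, Iterable, List


def extract_text(events: Iterable[Dict[str, Any]]) -> List[str]:
    """Collect the text of every text part, skipping other event types."""
    texts: List[str] = []
    for event in events:
        # Some events (tool calls, bookkeeping) carry no content or no text
        content = event.get("content") or {}
        for part in content.get("parts") or []:
            text = part.get("text")
            if text:
                texts.append(text)
    return texts


# Hand-written sample events (assumed shapes, for illustration only)
sample_events = [
    {"author": "shopmax_support",
     "content": {"parts": [{"function_call": {"name": "get_order_status"}}]}},
    {"author": "shopmax_support",
     "content": {"parts": [{"text": "Order ORD-88421 is out for delivery."}]}},
    {"author": "shopmax_support"},
]

print(extract_text(sample_events))
```

Using .get() with fallbacks rather than indexing keeps the helper tolerant of events that have no content, which is common when the agent calls a tool before answering.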

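The ADK CLI also provides a command-line deployment workflow through its adk deploy agent_engine command (run adk deploy agent_engine --help for the options your version supports). Its output looks like this: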

Deploying agent to Vertex AI Agent Engine...
Uploading agent code...
Provisioning managed infrastructure...
Agent deployed successfully!

Resource: projects/my-project/locations/us-central1/reasoningEngines/1234567890
Endpoint:  https://us-central1-aiplatform.googleapis.com/v1/...
Status:    ACTIVE

# Agent is now live; Agent Engine scales it automatically with request load
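
The resource name printed at deployment is the identifier to record if you want to find the agent again later. The sketch below splits a name of the form shown above into its components, which helps with logging and with checking that a staging or production name points at the intended project and region. The parse_resource_name helper is hypothetical and is not part of the Vertex AI SDK.

```python
import re
from typing import Dict

# Matches names like projects/<project>/locations/<location>/reasoningEngines/<id>
_RESOURCE_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/reasoningEngines/(?P<engine_id>[^/]+)$"
)


def parse_resource_name(resource_name: str) -> Dict[str, str]:
    """Split an Agent Engine resource name into project, location and ID."""
    match = _RESOURCE_PATTERN.match(resource_name)
    if match is None:
        raise ValueError(f"Not an Agent Engine resource name: {resource_name!r}")
    return match.groupdict()


name = "projects/my-project/locations/us-central1/reasoningEngines/1234567890"
print(parse_resource_name(name))
```

In current SDK versions the full resource name can be passed to vertexai.agent_engines.get() to reconnect to a deployed agent from another process; check the SDK reference for your installed version.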

Agent Engine deployment checklist: test your agent thoroughly with the ADK eval framework (Tutorial 318) before deploying. Use a staging deployment in a separate GCP project before promoting to production. Set up Cloud Monitoring alerts on error rates and latency, since Agent Engine emits standard GCP metrics. For multi-region availability, deploy to several regions and put a load balancer in front of them, because Agent Engine does not route across regions automatically. Each deployment has a unique resource name; keep the last 2-3 known-good deployments available so you can roll back quickly if a new one has quality issues.
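
A staging deployment can be given a quick pass/fail check before promotion. The sketch below is a simple keyword-based smoke test, not the ADK eval framework. It assumes a query callable that takes a message and yields event dicts in the shape used earlier; smoke_test, fake_query and the test cases are hypothetical names and data, with fake_query standing in for a deployed agent.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

QueryFn = Callable[[str], Iterable[Dict[str, Any]]]


def smoke_test(query: QueryFn, cases: List[Tuple[str, str]]) -> List[str]:
    """Return one failure message per case whose reply lacks the keyword."""
    failures: List[str] = []
    for question, expected in cases:
        reply_parts: List[str] = []
        for event in query(question):
            content = event.get("content") or {}
            for part in content.get("parts") or []:
                if "text" in part:
                    reply_parts.append(part["text"])
        reply = " ".join(reply_parts)
        # Case-insensitive substring match; deliberately loose because
        # model wording varies between runs
        if expected.lower() not in reply.lower():
            failures.append(f"{question!r}: expected {expected!r} in reply")
    return failures


def fake_query(message: str) -> Iterable[Dict[str, Any]]:
    """Stand-in for a deployed agent, used so the sketch runs offline."""
    answer = "Out for delivery, ETA today 7pm" if "ORD-88421" in message else "Not sure"
    yield {"author": "shopmax_support", "content": {"parts": [{"text": answer}]}}


cases = [("Where is my order ORD-88421?", "out for delivery")]
print(smoke_test(fake_query, cases) or "Smoke test passed")
```

Against a real staging deployment, the query callable could wrap the deployed app, for example lambda q: remote_app.stream_query(user_id="smoke-test", message=q), assuming the remote_app handle from the deployment example.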
