Mocking Gemini LLM Responses in ADK Tests
Author: Venkata Sudhakar
ShopMax India's ADK agents call the Gemini API on every conversation turn, making live tests slow, non-deterministic, and expensive during CI runs. Mocking the Gemini response lets you assert on agent behavior using controlled, repeatable outputs without consuming API quota. Tests run in under a second and produce the same result on every machine.
The ADK routes LLM calls through google.adk.models.google_llm.Gemini internally. Patching its _generate_content_async method with AsyncMock intercepts every Gemini call and returns a pre-built response object. The agent processes this mock response exactly as it would a real one, so all downstream logic - tool selection, reply formatting - executes normally under test without any live API traffic.
The example below mocks Gemini responses for a ShopMax India support agent, verifying that the reply contains the expected order ID and that exactly one LLM call fires per conversation turn.
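A stdlib-only sketch of that test suite follows. In a real project the tests would live in tests/test_agent.py and patch google.adk.models.google_llm.Gemini directly; here a stand-in Gemini class and a hypothetical SupportAgent model the same call path so the sketch runs without the ADK installed. Under pytest (with an async plugin such as pytest-asyncio) the two test functions correspond to the output shown below.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class Gemini:
    """Stand-in for google.adk.models.google_llm.Gemini (assumed API)."""

    async def _generate_content_async(self, llm_request):
        raise RuntimeError("live Gemini call - must never run under test")


class SupportAgent:
    """Stand-in for the ShopMax India ADK support agent."""

    def __init__(self, model: Gemini):
        self.model = model

    async def handle_turn(self, user_message: str) -> str:
        # One LLM call per turn; downstream reply formatting would go here.
        return await self.model._generate_content_async(user_message)


def mocked_gemini(reply: str):
    """Patch the internal generation method so every call returns `reply`."""
    mock = AsyncMock(return_value=reply)
    return patch.object(Gemini, "_generate_content_async", mock), mock


async def test_order_reply_contains_order_id():
    patcher, _ = mocked_gemini("Your order ORD-12345 ships tomorrow.")
    with patcher:
        reply = await SupportAgent(Gemini()).handle_turn("Where is ORD-12345?")
    assert "ORD-12345" in reply


async def test_llm_called_exactly_once_per_turn():
    patcher, mock = mocked_gemini("Your order ORD-12345 ships tomorrow.")
    with patcher:
        await SupportAgent(Gemini()).handle_turn("Where is ORD-12345?")
    assert mock.await_count == 1


if __name__ == "__main__":
    asyncio.run(test_order_reply_contains_order_id())
    asyncio.run(test_llm_called_exactly_once_per_turn())
```

Because the mock is installed on the class, every Gemini instance the agent constructs is intercepted, and the AsyncMock's await_count gives an exact tally of LLM calls per turn.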
Running the suite gives the following output:
tests/test_agent.py::test_order_reply_contains_order_id PASSED
tests/test_agent.py::test_llm_called_exactly_once_per_turn PASSED
2 passed in 0.31s
In production, use side_effect instead of return_value when a test turn triggers multiple LLM calls - for example when a tool-use agent calls Gemini once to select the tool and again to format the final reply. Store mock response fixtures in JSON files under tests/fixtures/ so test data stays separate from test logic and is easy to update when the agent instruction changes.
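The side_effect pattern can be sketched as follows. The fixture content is shown inline here so the example is self-contained; in a real suite it would be read from a file under tests/fixtures/, and the two fixture strings below are hypothetical stand-ins for the tool-selection and reply-formatting responses.

```python
import asyncio
import json
from unittest.mock import AsyncMock

# Hypothetical fixture content; normally loaded from tests/fixtures/*.json
# so test data stays separate from test logic.
FIXTURE_JSON = '["call_tool: lookup_order", "Your order ORD-12345 ships tomorrow."]'


async def demo():
    responses = json.loads(FIXTURE_JSON)
    # side_effect hands out one canned response per awaited call,
    # in order, instead of repeating a single return_value.
    mock_llm = AsyncMock(side_effect=responses)

    first = await mock_llm("select a tool")      # tool-selection call
    second = await mock_llm("format the reply")  # reply-formatting call
    assert mock_llm.await_count == 2
    return first, second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If the agent makes more LLM calls than the fixture provides, the mock raises StopAsyncIteration, which surfaces unexpected extra calls as a test failure rather than silently reusing a stale response.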