ADK Agent Stress Testing Under Sustained Load
Author: Venkata Sudhakar
ShopMax India's ADK agents handle thousands of concurrent customer requests during peak sale events in Mumbai, Delhi, and Bangalore. Sustained load testing applies high-traffic pressure over an extended window rather than a single burst, revealing latency degradation and rising error rates that appear only after the agent has been running under load for several minutes.
The test launches multiple asyncio workers that call the agent concurrently through a semaphore, collecting per-request latency and error flags over a timed window. After the run, p95 latency and error rate are calculated from the collected results. A p95 above 1.5x the baseline or an error rate above 1% flags a failing test.
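The pass/fail rule above can be sketched as a small pure function. This is an illustrative sketch, not a fixed implementation: the nearest-rank percentile method, the helper names p95 and evaluate_run, the sample data, and the 50 ms baseline are all assumptions chosen for the example.

```python
BASELINE_LATENCY_MS = 50.0  # assumed baseline, for illustration only
MAX_P95_RATIO = 1.5         # p95 may not exceed 1.5x the baseline
MAX_ERROR_RATE = 0.01       # error rate may not exceed 1%


def p95(latencies_ms):
    """Nearest-rank 95th percentile: the smallest value covering 95% of samples."""
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def evaluate_run(latencies_ms, error_flags, baseline_ms=BASELINE_LATENCY_MS):
    """Return (p95_ms, error_rate, passed) for one completed test window."""
    p95_ms = p95(latencies_ms)
    error_rate = sum(error_flags) / len(error_flags)
    passed = p95_ms <= MAX_P95_RATIO * baseline_ms and error_rate <= MAX_ERROR_RATE
    return p95_ms, error_rate, passed


# Twenty hypothetical latency samples in milliseconds, including two slow outliers.
sample_latencies = [48, 50, 51, 49, 52, 50, 47, 53, 50, 49,
                    51, 50, 48, 52, 50, 49, 51, 50, 60, 90]
sample_errors = [False] * 20

p95_ms, error_rate, passed = evaluate_run(sample_latencies, sample_errors)
print(p95_ms, error_rate, passed)  # → 60 0.0 True
```

With 20 samples, the nearest-rank p95 is the 19th smallest value (60 ms), so the single 90 ms outlier does not fail the run, while a single failed request out of 20 (5%) would.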
The example below runs a 3-second sustained stress test with 3 concurrent workers against a mock agent and asserts that p95 latency and error rate stay within acceptable bounds.
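A minimal sketch of such a test follows, assuming a mock agent that simply sleeps for roughly 50 ms per call. Only mock_agent_call, duration_seconds, and BASELINE_LATENCY_MS are names taken from this article; every other name, the nearest-rank p95 calculation, and the 50 ms baseline are illustrative assumptions.

```python
import asyncio
import random
import time

BASELINE_LATENCY_MS = 50.0  # assumed baseline; tune from staging traffic
MAX_P95_RATIO = 1.5         # fail if p95 exceeds 1.5x the baseline
MAX_ERROR_RATE = 0.01       # fail if more than 1% of requests error


async def mock_agent_call(prompt):
    """Stand-in for a real agent call: waits about 50 ms and returns a reply."""
    await asyncio.sleep(random.uniform(0.045, 0.052))
    return f"ok: {prompt}"


async def worker(worker_id, deadline, semaphore, results):
    """Call the agent in a loop until the deadline, recording each outcome."""
    while time.monotonic() < deadline:
        async with semaphore:  # caps the number of in-flight requests
            start = time.perf_counter()
            is_error = False
            try:
                await mock_agent_call(f"order status (worker {worker_id})")
            except Exception:
                is_error = True
            latency_ms = (time.perf_counter() - start) * 1000
        results.append((latency_ms, is_error))


async def run_stress_test(duration_seconds=3.0, num_workers=3, max_in_flight=3):
    """Run the workers for a timed window and summarize the collected results."""
    semaphore = asyncio.Semaphore(max_in_flight)
    results = []
    deadline = time.monotonic() + duration_seconds
    await asyncio.gather(
        *(worker(i, deadline, semaphore, results) for i in range(num_workers))
    )
    if not results:
        raise RuntimeError("no requests completed during the test window")
    latencies = sorted(latency for latency, _ in results)
    rank = (95 * len(latencies) + 99) // 100  # nearest-rank p95 index (1-based)
    return {
        "total": len(results),
        "p95_ms": latencies[rank - 1],
        "error_rate": sum(err for _, err in results) / len(results),
    }


def test_sustained_stress():
    stats = asyncio.run(run_stress_test(duration_seconds=3.0))
    print(f"Total requests: {stats['total']}")
    print(f"p95 latency: {stats['p95_ms']:.1f}ms")
    print(f"Error rate: {stats['error_rate'] * 100:.1f}%")
    assert stats["p95_ms"] <= MAX_P95_RATIO * BASELINE_LATENCY_MS
    assert stats["error_rate"] <= MAX_ERROR_RATE
```

The test function is synchronous and drives the event loop with asyncio.run, so it runs under plain pytest without an async plugin. Request counts and latencies vary from run to run because the mock delay is randomized.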
A run gives output similar to the following:
Total requests: 178
p95 latency: 52.3ms
Error rate: 0.0%
. (1 passed in 3.08s)
In production, replace mock_agent_call with a real ADK runner call and set duration_seconds to at least 60 for meaningful results. Tune BASELINE_LATENCY_MS using values observed during normal traffic in your staging environment. Run the sustained stress test as a nightly CI job so that gradual latency regressions introduced by model updates or new tool calls are caught before they affect customers.