Load Testing ADK Agents - Simulating Concurrent User Requests
Author: Venkata Sudhakar
ShopMax India runs thousands of customer support sessions daily through ADK agents. A single agent response may perform well in isolation, but load testing shows how the system behaves under concurrent user traffic - exposing latency spikes, resource contention, and failure rates that only appear at scale. Load testing ADK agents before a product launch prevents customer-facing outages on high-traffic days like sale events.
Load testing ADK agents uses asyncio to fire many sessions in parallel and measure aggregate throughput, latency, and failure rate. The key metrics are: requests per second (throughput), mean and p95 latency, and error rate under load. In unit and integration tests, the LLM is mocked to isolate the agent logic from network variability. In pre-production load tests, the real LLM endpoint is hit to measure end-to-end performance.
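The metrics above can be computed directly from a list of per-session latencies collected during a run. A minimal sketch, using a hypothetical set of measured latencies (the values here are illustrative, not from a real run):

```python
import statistics

# Hypothetical per-session latencies (seconds) gathered during a load test.
latencies = [0.41, 0.38, 0.52, 0.47, 0.95, 0.44, 0.50, 0.39, 1.20, 0.46]
errors = 1                          # sessions that raised or timed out
total = len(latencies) + errors     # every attempted session

mean = statistics.mean(latencies)
# p95: the latency below which 95% of successful requests complete.
p95 = statistics.quantiles(latencies, n=100)[94]
error_rate = errors / total

print(f"mean={mean:.2f}s  p95={p95:.2f}s  error_rate={error_rate:.1%}")
```

With only a handful of samples the p95 estimate is coarse; in practice you would compute it over hundreds of sessions.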
The example shows ShopMax India running a load test with 50 concurrent order tracking sessions. The agent call is mocked with a short simulated latency, and the test asserts that all sessions complete successfully within an acceptable time budget.
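A minimal sketch of such a test is below. The session and agent names are illustrative; in a real ADK test the `asyncio.sleep` call would be replaced by a mocked `runner.run_async(...)` invocation:

```python
import asyncio
import time

CONCURRENT_SESSIONS = 50
SIMULATED_LATENCY_S = 0.05  # stand-in for the mocked agent call's latency

async def run_session(session_id: int) -> bool:
    """Run one simulated order-tracking session; returns True on success."""
    try:
        # In a real test, call the agent runner here with the LLM mocked out.
        await asyncio.sleep(SIMULATED_LATENCY_S)
        return True
    except Exception:
        return False

async def load_test() -> None:
    start = time.perf_counter()
    # Fire all sessions concurrently and wait for every one to finish.
    results = await asyncio.gather(
        *(run_session(i) for i in range(CONCURRENT_SESSIONS))
    )
    elapsed = time.perf_counter() - start
    successes = sum(results)

    print(f"Total sessions: {CONCURRENT_SESSIONS}")
    print(f"Successes: {successes}")
    print(f"Failures: {CONCURRENT_SESSIONS - successes}")
    print(f"Elapsed: {elapsed:.3f} seconds")
    print(f"Throughput: {CONCURRENT_SESSIONS / elapsed:.1f} req/sec")

    # The time budget: all sessions must complete within one second.
    assert successes == CONCURRENT_SESSIONS
    assert elapsed < 1.0

asyncio.run(load_test())
```

Because the sessions run concurrently, the total elapsed time stays close to a single session's latency rather than 50 times it.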
Running the load test produces the following output:
Total sessions: 50
Successes: 50
Failures: 0
Elapsed: 0.052 seconds
Throughput: 961.5 req/sec
Run mocked load tests in CI to catch regressions in agent logic under concurrency - race conditions and shared state bugs only appear with parallel execution. For pre-production load tests, use a staging environment and target 2x expected peak traffic. Set a p95 latency budget (e.g., 3 seconds for order queries) and fail the test if it is exceeded. Use asyncio.Semaphore to cap concurrency during ramp-up tests to simulate gradual traffic increases rather than an instant spike.
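The ramp-up pattern can be sketched with asyncio.Semaphore as follows; the concurrency cap and session count are illustrative values, and the mocked session stands in for a real agent call:

```python
import asyncio

MAX_CONCURRENCY = 10  # illustrative cap for the ramp-up phase

async def run_session(session_id: int) -> bool:
    # Stand-in for a mocked agent call.
    await asyncio.sleep(0.05)
    return True

async def ramped_session(session_id: int, sem: asyncio.Semaphore) -> bool:
    # The semaphore admits at most MAX_CONCURRENCY sessions at a time,
    # so traffic builds in waves instead of an instant spike.
    async with sem:
        return await run_session(session_id)

async def ramp_up_test(total_sessions: int = 50) -> int:
    sem = asyncio.Semaphore(MAX_CONCURRENCY)
    results = await asyncio.gather(
        *(ramped_session(i, sem) for i in range(total_sessions))
    )
    return sum(results)

successes = asyncio.run(ramp_up_test())
print(f"Successes: {successes}")
```

Raising MAX_CONCURRENCY between runs gives a simple staircase ramp; a fuller harness might also record per-wave latencies to spot the point where p95 starts to degrade.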