Testing ADK Agent Cold Start Latency
Author: Venkata Sudhakar
Cold start latency testing measures how long an ADK agent takes to serve its first request after being initialized from a cold state, meaning no warm cache, no preloaded models, and no active connections. ShopMax India monitors cold start latency for its Cloud Run-deployed agents because spiky traffic during flash sales in Mumbai triggers scale-out events, and each new instance must handle real customer requests within the SLA budget while still fully cold.
Cold start is measured by timing the first tool call made against a freshly created agent, so the measurement includes connection setup, session service initialization, and any lazy-loaded configuration. Warm start is measured as the latency of subsequent calls, after initialization is complete. The test asserts that cold start stays below a defined ceiling (e.g. 2000ms) and warm start stays below the normal SLA (e.g. 300ms), and prints both values so the delta is visible in CI output.
The example below simulates cold start overhead using a lazy-initialization pattern, measures first-call and subsequent-call latency, and asserts both stay within their respective thresholds.
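A minimal sketch of such a test is shown here, assuming pytest as the runner. LazyAgent, call_tool, timed_call_ms, and the 50ms simulated initialization delay are hypothetical stand-ins chosen for illustration and are not part of the ADK API; the two thresholds are the example values mentioned above.

```python
import time

COLD_START_CEILING_MS = 2000.0  # example ceiling from the text
WARM_SLA_MS = 300.0             # example warm-call SLA from the text


class LazyAgent:
    """Hypothetical stand-in for an ADK agent; not part of the ADK API."""

    def __init__(self, init_delay_s=0.05):
        self._init_delay_s = init_delay_s
        self._ready = False

    def _initialize(self):
        # Simulated cold-start work: connection setup, session service
        # initialization, and lazy-loaded configuration.
        time.sleep(self._init_delay_s)
        self._ready = True

    def call_tool(self, query):
        # Lazy initialization: only the first call pays the setup cost.
        if not self._ready:
            self._initialize()
        return {"query": query, "status": "ok"}


def timed_call_ms(agent, query):
    """Return the wall-clock latency of one tool call in milliseconds."""
    start = time.perf_counter()
    agent.call_tool(query)
    return (time.perf_counter() - start) * 1000.0


def test_cold_start_below_ceiling():
    agent = LazyAgent()  # fresh instance, nothing initialized yet
    cold_ms = timed_call_ms(agent, "order status")
    print(f"Cold start: {cold_ms:.1f}ms (ceiling {COLD_START_CEILING_MS:.0f}ms)")
    assert cold_ms < COLD_START_CEILING_MS


def test_warm_call_below_sla():
    agent = LazyAgent()
    agent.call_tool("warm-up")  # pay the initialization cost up front
    warm_ms = timed_call_ms(agent, "order status")
    print(f"Warm call: {warm_ms:.2f}ms (SLA {WARM_SLA_MS:.0f}ms)")
    assert warm_ms < WARM_SLA_MS


def test_cold_warm_delta_is_reported():
    agent = LazyAgent()
    cold_ms = timed_call_ms(agent, "first call")
    warm_ms = timed_call_ms(agent, "second call")
    delta_ms = cold_ms - warm_ms
    print(f"Cold={cold_ms:.1f}ms Warm={warm_ms:.2f}ms Delta={delta_ms:.1f}ms")
    assert cold_ms < COLD_START_CEILING_MS
    assert warm_ms < WARM_SLA_MS
    assert cold_ms > warm_ms
```

Each test builds its own LazyAgent so that no test inherits a warm instance from another. Running pytest with the -s flag keeps the printed timings visible; the exact values depend on the machine.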
The tests produce output like the following; exact timings vary between runs:
Cold start: 51.3ms (ceiling 2000ms)
Warm call: 0.02ms (SLA 300ms)
Cold=51.4ms Warm=0.02ms Delta=51.4ms
3 passed in 0.16s
Run cold start tests in a dedicated CI job that starts a fresh process for each test, so that state left behind by earlier tests cannot make a cold measurement look warm. For Cloud Run agents, set the minimum number of instances to 1 (the --min-instances flag) to eliminate cold starts for baseline traffic, and size the cold start budget to cover the P99 scale-out latency observed during the previous peak traffic event. Before optimizing, profile initialization with cProfile to identify which step (config load, DB connection, model load) dominates cold start time.
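The profiling step can be sketched as follows. This is an illustration under stated assumptions: load_config, open_db_connection, load_model, and initialize_agent are hypothetical placeholders whose sleep calls stand in for real initialization work, while cProfile and pstats are the standard library modules.

```python
import cProfile
import io
import pstats
import time


# Hypothetical initialization steps; the sleeps stand in for real work.
def load_config():
    time.sleep(0.01)


def open_db_connection():
    time.sleep(0.03)


def load_model():
    time.sleep(0.02)


def initialize_agent():
    load_config()
    open_db_connection()
    load_model()


def profile_initialization(top_n=5):
    """Profile one cold initialization and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    initialize_agent()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive steps are listed first.
    stats.sort_stats("cumulative").print_stats(top_n)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_initialization())
```

Sorting by cumulative time attributes the cost of nested calls to the step that triggered them, which is usually the view needed to decide whether config load, the DB connection, or model load should be optimized first.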