Testing ADK Agent Streaming Responses

Author: Venkata Sudhakar

ShopMax India uses ADK streaming responses to show customers a live typing effect as the agent generates order status updates and product recommendations, reducing perceived wait time. Testing streaming agents requires verifying that chunks arrive in order, that the concatenated stream equals the expected full response, that the first chunk arrives within an acceptable latency, and that the stream closes cleanly without partial or duplicate tokens.

To test ADK agent streaming, replace the LLM with an async generator mock that yields a fixed sequence of chunks. Collect all chunks, assert on their order and content, and verify that the assembled response matches the expected final text. Also test the edge cases: a single-chunk stream, an empty stream, and a stream that raises an error midway. Because no real LLM is called, these tests run quickly and deterministically.
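The single-chunk and empty-stream edge cases mentioned above can be sketched roughly as follows. This is a minimal illustration, not ADK code: `stream_from` is a hypothetical helper that stands in for the mocked agent, `collect_stream` is redefined here only to keep the block self-contained, and the order ID is made up.

```python
import asyncio

# Hypothetical helper (not an ADK API): turns any list of strings into a mock stream.
async def stream_from(chunks):
    for chunk in chunks:
        yield chunk

# Drain an async generator into a list, preserving arrival order.
async def collect_stream(generator):
    return [chunk async for chunk in generator]

def test_single_chunk_stream():
    # The whole response arrives as one chunk; nothing should be split or repeated.
    chunks = asyncio.run(collect_stream(stream_from(["Order ORD-8004 delivered."])))
    assert chunks == ["Order ORD-8004 delivered."]

def test_empty_stream_yields_nothing():
    # An empty stream should close cleanly and assemble to an empty string.
    chunks = asyncio.run(collect_stream(stream_from([])))
    assert chunks == []
    assert "".join(chunks) == ""
```

Comparing against the full list, rather than checking substrings, also catches duplicated or missing chunks.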

The example below mocks a streaming ADK agent for ShopMax India that yields order status in three chunks. Tests verify chunk order, the assembled response, and a mid-stream error path where the consumer receives all chunks delivered before the error.

import asyncio
import pytest

# Mock streaming agent: yields a fixed sequence of chunks in place of a real LLM.
async def mock_stream_agent(order_id):
    chunks = [
        "Order " + order_id + " ",
        "has been dispatched ",
        "from our Delhi warehouse."
    ]
    for chunk in chunks:
        yield chunk

# Mock stream that fails after delivering one chunk.
async def mock_error_stream(order_id):
    yield "Partial response "
    raise RuntimeError("Stream interrupted")

# Drain an async generator into a list, preserving arrival order.
async def collect_stream(generator):
    chunks = []
    async for chunk in generator:
        chunks.append(chunk)
    return chunks

def test_stream_chunks_arrive_in_order():
    chunks = asyncio.run(collect_stream(mock_stream_agent("ORD-8001")))
    assert len(chunks) == 3
    assert chunks[0].startswith("Order ORD-8001")
    assert "dispatched" in chunks[1]
    assert "Delhi" in chunks[2]

def test_assembled_stream_equals_expected_response():
    chunks = asyncio.run(collect_stream(mock_stream_agent("ORD-8002")))
    full = "".join(chunks)
    # Exact equality catches missing, duplicated, or reordered chunks.
    assert full == "Order ORD-8002 has been dispatched from our Delhi warehouse."

def test_stream_error_delivers_partial_chunks():
    async def run():
        chunks = []
        # The error must reach the consumer; chunks received before it are kept.
        with pytest.raises(RuntimeError):
            async for chunk in mock_error_stream("ORD-8003"):
                chunks.append(chunk)
        return chunks
    chunks = asyncio.run(run())
    assert chunks == ["Partial response "]

Running the tests with pytest gives the following output:

... (3 passed in 0.02s)
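The first-chunk latency requirement from the introduction can be checked with the same mocking approach. The sketch below is one possible way to do it: `delayed_stream`, `measure_stream`, the 10 ms simulated delay, and the 0.5 s budget are all assumptions chosen for illustration, not ADK APIs or ShopMax India's actual thresholds.

```python
import asyncio
import time

# Hypothetical mock: waits before the first chunk to simulate model latency.
async def delayed_stream(chunks, first_delay=0.01):
    await asyncio.sleep(first_delay)
    for chunk in chunks:
        yield chunk

# Collect chunks while recording time to first chunk and total stream time.
async def measure_stream(generator):
    start = time.perf_counter()
    first_chunk_latency = None
    chunks = []
    async for chunk in generator:
        if first_chunk_latency is None:
            first_chunk_latency = time.perf_counter() - start
        chunks.append(chunk)
    total_latency = time.perf_counter() - start
    return chunks, first_chunk_latency, total_latency

def test_first_chunk_latency_within_budget():
    chunks, first, total = asyncio.run(
        measure_stream(delayed_stream(["Order ", "dispatched."]))
    )
    assert first is not None      # at least one chunk arrived
    assert first < 0.5            # assumed budget in seconds
    assert first <= total         # first chunk cannot arrive after the stream ends
    assert "".join(chunks) == "Order dispatched."
```

Keeping the budget generous relative to the simulated delay avoids flaky failures on slow CI machines. For an empty stream, `first_chunk_latency` stays `None`.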

In production, ShopMax India should set a maximum stream duration and a maximum total token count per stream to prevent the agent from generating indefinitely long responses. Buffer chunks on the client side and flush them to the UI every 50 ms to balance responsiveness against render overhead. Monitor first-chunk latency as a separate metric from total response latency, since customers perceive the stream as starting as soon as the first token arrives.
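One way to enforce the duration and size limits is to wrap the stream in a guard generator. The following is only a sketch under stated assumptions: `limited_stream` and `StreamLimitExceeded` are hypothetical names, the default limits are arbitrary, and the guard counts chunks as a rough stand-in for tokens, since a mock stream carries no real token counts.

```python
import asyncio

class StreamLimitExceeded(Exception):
    """Raised when a stream runs too long or produces too many chunks."""

# Hypothetical guard (not an ADK API): re-yields chunks until a limit is hit.
async def limited_stream(generator, max_chunks=200, max_seconds=30.0):
    loop = asyncio.get_running_loop()
    deadline = loop.time() + max_seconds
    count = 0
    try:
        while True:
            remaining = deadline - loop.time()
            if remaining <= 0:
                raise StreamLimitExceeded("stream exceeded max duration")
            try:
                # Bound the wait for the next chunk by the time left in the budget.
                chunk = await asyncio.wait_for(generator.__anext__(), timeout=remaining)
            except StopAsyncIteration:
                return  # upstream closed cleanly
            except asyncio.TimeoutError:
                raise StreamLimitExceeded("stream exceeded max duration")
            count += 1
            if count > max_chunks:
                raise StreamLimitExceeded("stream exceeded max chunk count")
            yield chunk
    finally:
        # Release the upstream generator whether we finished or aborted.
        await generator.aclose()
```

A consumer would iterate over `limited_stream(agent_stream)` instead of the raw stream and treat `StreamLimitExceeded` like any other mid-stream error, keeping the chunks already received.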
