Agent Output Grounding Verification for ADK Agents
Author: Venkata Sudhakar
Agent output grounding verification tests that an ADK agent's response is anchored to the data retrieved by its tools rather than generated from the model's parametric memory. ShopMax India's product and pricing agents must cite real catalog data: a response claiming a TV costs Rs 55000 when the tool returned Rs 62000 is an ungrounded hallucination that directly erodes customer trust and revenue in Chennai and Bangalore.
The grounding check extracts key facts from the tool's return value (price, stock, product name) and asserts that each fact appears, verbatim or in an equivalent form, in the agent's final response. In unit tests, the tool output is known exactly, so substring matching is sufficient. For LLM-generated prose, a second-pass check uses the tool output as a reference and scans the response for each required fact using simple string search or regex.
The example below defines a grounding verifier that extracts expected facts from a tool response dict, runs the agent's response generator, and asserts all required facts are present in the output.
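A minimal sketch of such a verifier, assuming a fixed tool-output dict and stubbed response strings in place of a live ADK agent call (the `TOOL_OUTPUT` values, `extract_facts`, and `find_hallucinations` names are illustrative, not part of the ADK API):

```python
# Simulated tool return value; in a real test this would come from the
# agent's tool call (e.g. a catalog lookup).
TOOL_OUTPUT = {
    "name": "Samsung 55-inch 4K TV",
    "price": 62000,
    "stock": "In stock at Mumbai warehouse",
}


def extract_facts(tool_output: dict) -> dict:
    """Pull the facts the response must be grounded in, as strings."""
    return {
        "name": tool_output["name"],
        "price": str(tool_output["price"]),
        "stock": tool_output["stock"],
    }


def find_hallucinations(response: str, facts: dict) -> list:
    """Return the keys of facts that do not appear in the response."""
    return [key for key, value in facts.items() if value not in response]


def test_grounded_response():
    # Response faithfully reflects the tool output: no missing facts.
    response = (
        "The Samsung 55-inch 4K TV is available for Rs 62000. "
        "Status: In stock at Mumbai warehouse."
    )
    print(f"Response: {response}")
    assert find_hallucinations(response, extract_facts(TOOL_OUTPUT)) == []


def test_hallucinated_price_is_detected():
    # Response invents a price the tool never returned.
    response = (
        "The Samsung 55-inch 4K TV is available for Rs 55000. "
        "Status: In stock at Mumbai warehouse."
    )
    missing = find_hallucinations(response, extract_facts(TOOL_OUTPUT))
    print(f"Hallucination detected: {missing}")
    assert missing == ["price"]
```

Run with `pytest -s` so the `print` diagnostics are shown alongside the pass/fail summary.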
Running it gives the following output:
Response: The Samsung 55-inch 4K TV is available for Rs 62000. Status: In stock at Mumbai warehouse.
Hallucination detected: ['price']
2 passed in 0.04s
For LLM-generated responses where paraphrasing is expected (e.g. '62 thousand' instead of '62000'), supplement string matching with a numeric extraction step that parses all currency amounts from the response and compares them to the tool output values. Run grounding tests on every prompt change and after every model upgrade since new model versions can alter how faithfully retrieved context is reflected in the output.
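One way to sketch that numeric extraction step, assuming Indian number words (thousand, lakh, crore) as the paraphrase vocabulary; the `parse_currency_amounts` helper is illustrative:

```python
import re

# Scale words commonly used for Indian currency amounts (assumption:
# these are the only verbal forms the agent's responses use).
SCALES = {"thousand": 1_000, "lakh": 100_000, "crore": 10_000_000}


def parse_currency_amounts(text: str) -> set:
    """Extract all candidate currency amounts from a response string."""
    amounts = set()
    # Plain numerals, with optional comma grouping: "62000", "62,000".
    for match in re.findall(r"\d[\d,]*", text):
        amounts.add(int(match.replace(",", "")))
    # Verbal forms: "62 thousand", "1.5 lakh".
    for num, word in re.findall(
        r"(\d+(?:\.\d+)?)\s*(thousand|lakh|crore)", text, re.IGNORECASE
    ):
        amounts.add(int(float(num) * SCALES[word.lower()]))
    return amounts
```

A grounding check then asserts that the tool's price appears in the parsed set, e.g. `62000 in parse_currency_amounts(response)`, which passes for both "Rs 62000" and "62 thousand".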