
LLM Hallucination Detection with SelfCheckGPT

Author: Venkata Sudhakar

ShopMax India's product Q&A bot sometimes generates confident but incorrect answers - wrong prices, outdated specs, or fabricated store locations. SelfCheckGPT detects hallucinations by sampling the LLM multiple times with the same prompt and measuring consistency across outputs. Inconsistent facts across samples signal hallucination rather than grounded knowledge.

SelfCheckGPT works on the principle that if an LLM truly knows a fact, it will state it consistently across multiple samples. If outputs disagree, the claim is likely hallucinated. The library scores each sentence of the main response against N sampled passages; with the BERTScore variant, the score is one minus the average BERTScore between the sentence and its closest match in each sample. A score close to 1 therefore means high inconsistency - hallucination risk - while a score near 0 means the fact is consistently stated across samples.
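
As a toy illustration of this principle: the SelfCheckBERTScore class below is the library's real API, but the product claims and samples are invented for demonstration.

from selfcheckgpt.modeling_selfcheck import SelfCheckBERTScore

selfcheck = SelfCheckBERTScore(rescale_with_baseline=True)

# One claim the samples repeat verbatim, one they contradict.
sentences = [
    "The Galaxy S24 Ultra costs Rs 134999.",
    "The flagship store is in Banjara Hills.",
]
samples = [
    "The Galaxy S24 Ultra costs Rs 134999. Our flagship store is in Jubilee Hills.",
    "The Galaxy S24 Ultra costs Rs 134999. The flagship store is in Gachibowli.",
]

scores = selfcheck.predict(sentences=sentences, sampled_passages=samples)
print(scores)  # the consistent price claim should score lower than the contradicted location claim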

The example below runs hallucination detection on ShopMax India product queries. For each question it generates a primary answer, draws 4 additional samples at high temperature, and then scores the primary response against those samples using SelfCheckBERTScore.
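
Here is a minimal sketch of that pipeline, assuming the selfcheckgpt package, spaCy for sentence splitting, and an OpenAI-compatible client. The model name and the 0.5 flagging threshold are illustrative assumptions, not details from ShopMax's stack.

import spacy
from openai import OpenAI
from selfcheckgpt.modeling_selfcheck import SelfCheckBERTScore

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
nlp = spacy.load("en_core_web_sm")
selfcheck = SelfCheckBERTScore(rescale_with_baseline=True)

def ask(question: str, temperature: float) -> str:
    # One call to the Q&A model; the model name is an illustrative assumption.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
        temperature=temperature,
    )
    return response.choices[0].message.content

questions = [
    "What is the price of Samsung Galaxy S24 Ultra at ShopMax India?",
    "Does ShopMax India have a store in Hyderabad near Banjara Hills?",
]

for question in questions:
    answer = ask(question, temperature=0.1)                       # primary response
    samples = [ask(question, temperature=1.0) for _ in range(4)]  # 4 high-temp samples

    # Split the primary answer into sentences and score each against the samples.
    sentences = [sent.text.strip() for sent in nlp(answer).sents]
    sent_scores = selfcheck.predict(sentences=sentences, sampled_passages=samples)

    score = float(sent_scores.mean())  # average per-sentence inconsistency
    label = "HALLUCINATION RISK" if score > 0.5 else "RELIABLE"
    print(f"Q: {question}")
    print(f"A: {answer[:70]}...")
    print(f"Score: {score:.3f} [{label}]\n")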


It gives the following output:

Q: What is the price of Samsung Galaxy S24 Ultra at ShopMax India?
A: The Samsung Galaxy S24 Ultra is priced at Rs 134999 at ShopMax India...
Score: 0.312 [RELIABLE]

Q: Does ShopMax India have a store in Hyderabad near Banjara Hills?
A: Yes, ShopMax India operates a flagship store near Banjara Hills in Hyderabad...
Score: 0.631 [HALLUCINATION RISK]

In production, run SelfCheckGPT on responses before serving them to customers. Flag high-scoring responses for human review rather than blocking them outright - a score above 0.6 warrants review, while a score above 0.8 should be blocked and replaced with a safe fallback. Use the BERTScore method for semantic consistency and the NLI method for factual contradiction detection. Store hallucination scores in your observability database to track which question types are most prone to hallucination.
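
Below is a minimal sketch of that gating policy, assuming the library's SelfCheckNLI class and the thresholds above. The fallback message and the log_score and queue_for_review helpers are hypothetical stubs standing in for your own infrastructure.

from selfcheckgpt.modeling_selfcheck import SelfCheckNLI

selfcheck_nli = SelfCheckNLI(device="cuda")  # NLI variant: detects factual contradictions

SAFE_FALLBACK = "Sorry, I can't verify that right now. Please contact ShopMax India support."

def log_score(score: float) -> None:
    # Hypothetical stub - in production, write to your observability database.
    print(f"hallucination_score={score:.3f}")

def queue_for_review(answer: str) -> None:
    # Hypothetical stub - in production, push to a human-review queue.
    print(f"queued for review: {answer[:60]}")

def gate_response(sentences: list[str], samples: list[str], answer: str) -> str:
    scores = selfcheck_nli.predict(sentences=sentences, sampled_passages=samples)
    score = float(scores.mean())
    log_score(score)
    if score > 0.8:
        return SAFE_FALLBACK      # block and replace with a safe fallback
    if score > 0.6:
        queue_for_review(answer)  # serve, but flag for human review
    return answer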
