Contrastive Chain-of-Thought - Using Good and Bad Examples
Author: Venkata Sudhakar
Contrastive chain-of-thought prompting gives the LLM both a correct worked example and an incorrect one, and asks it to reason like the correct example while avoiding the mistakes in the incorrect one. At ShopMax India, when prompting an LLM to assess warranty claim validity, showing both a good assessment (claim approved with clear reasoning) and a bad assessment (claim denied without checking policy) sharpens the quality of subsequent judgments.
The contrastive approach works because LLMs learn from contrast: seeing what NOT to do is often as instructive as seeing what to do. The bad example highlights failure patterns - circular reasoning, missed facts, policy violations - that the model might otherwise reproduce. The good example shows the target reasoning style. Together they bracket the expected behavior more precisely than a single positive example alone.
The example below shows ShopMax India using contrastive CoT for warranty claim assessment. One good and one bad example are embedded in the prompt, followed by three new claims for the model to assess.
The example gives the following output (long claim lines are cut off in the printed output):
Claim: Whirlpool washing machine, purchased 5 months ago, drum n
Assessment: Reasoning: 5 months is within the 24-month warranty. Drum failure
is a covered mechanical defect. No misuse mentioned.
Decision: APPROVED - mechanical defect within warranty period.
Claim: Sony TV, purchased 18 months ago, remote control broken.
Assessment: Reasoning: 18 months exceeds the 12-month warranty period.
Decision: DENIED - warranty expired 6 months ago. Offer out-of-warranty repair.
Claim: Daikin AC, purchased 3 months ago, compressor noise. War
Assessment: Reasoning: Within 60-month warranty. However, customer dropped unit
during installation - this constitutes physical damage from misuse, not a defect.
Decision: DENIED - physical damage from customer mishandling voids warranty.
The contrastive examples produce precise, policy-grounded decisions that mirror real assessor reasoning. At ShopMax India, curate a contrastive example library from historical warranty decisions - include real approved and denied cases with anonymized data. Update the bad examples when new failure modes appear in assessor outputs. For high-stakes decisions like large-value warranty claims, chain contrastive CoT with a second review call that checks the first decision against the same examples to catch inconsistencies.
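The second review call described above might be sketched as follows. This is one possible arrangement under stated assumptions: the names extract_decision and assess_with_review, the prompt wording, and the call_llm parameter (any function that takes a prompt string and returns the model's text) are hypothetical. The examples argument is the same block of good and bad examples used for the first call.

```python
import re
from typing import Callable, Dict, Optional

# Matches the verdict format used in the assessments: "Decision: APPROVED - ..."
DECISION_RE = re.compile(r"Decision:\s*(APPROVED|DENIED)")


def extract_decision(text: str) -> Optional[str]:
    """Return the last APPROVED/DENIED verdict in the text, or None."""
    verdicts = DECISION_RE.findall(text)
    return verdicts[-1] if verdicts else None


def assess_with_review(
    claim: str, examples: str, call_llm: Callable[[str], str]
) -> Dict[str, object]:
    """Run a first assessment, then a review call against the same examples."""
    first = call_llm(f"{examples}\n\nClaim: {claim}\nAssessment:")

    review_prompt = (
        f"{examples}\n\n"
        "Check the assessment below against the examples above. "
        "Re-derive the decision step by step and end with a 'Decision:' line.\n\n"
        f"Claim: {claim}\nFirst assessment:\n{first}"
    )
    second = call_llm(review_prompt)

    first_verdict = extract_decision(first)
    review_verdict = extract_decision(second)
    # Disagreement or a missing verdict is treated as an inconsistency.
    consistent = first_verdict is not None and first_verdict == review_verdict
    return {
        "first": first_verdict,
        "review": review_verdict,
        "needs_human": not consistent,
    }
```

Routing disagreements to a human assessor, rather than letting the second call silently overrule the first, is a design choice in this sketch and not something the original workflow specifies.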