
Gemini Structured Output with Pydantic

Author: Venkata Sudhakar

Gemini structured output constrains the model to return JSON that matches the schema you supply. Instead of asking Gemini to "please return JSON" and then hoping the output is parseable, you pass a response_schema and generation is restricted to JSON that conforms to it. No markdown fences, no extra explanation text, no missing required fields - just clean structured data ready to load into your database. This is the right approach for any data extraction pipeline where downstream code depends on consistent field names and types.

You define the schema as a Pydantic model or as a plain Python dict written in JSON Schema style. Pass it in GenerateContentConfig as response_schema, alongside response_mime_type set to "application/json". Gemini enforces the schema at the generation level, not as a post-processing step, so a complete response parses as valid JSON without the usual cleanup code. Combined with Gemini Flash speed and pricing, this makes it practical to run structured extraction on thousands of documents per day, even inside a real-time pipeline.
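As a minimal sketch of the two schema styles described above, the snippet below defines a small model and a hand-written dict with the same shape, then builds the generation config. It assumes Pydantic v2 and the google-genai Python SDK; PriceQuote, PRICE_QUOTE_DICT_SCHEMA and build_config are hypothetical names used only for illustration, and the upper-case type names in the dict follow the style used in Google's SDK examples.

```python
from pydantic import BaseModel


class PriceQuote(BaseModel):
    """Hypothetical two-field schema, used only for illustration."""
    product_name: str
    unit_price: float


# The same shape written by hand as a plain dict (illustrative equivalent).
PRICE_QUOTE_DICT_SCHEMA = {
    "type": "OBJECT",
    "properties": {
        "product_name": {"type": "STRING"},
        "unit_price": {"type": "NUMBER"},
    },
    "required": ["product_name", "unit_price"],
}


def build_config(schema):
    """Build the generation config for either schema style.

    Requires the google-genai package; it is imported lazily so the
    schema definitions above can be used without the SDK installed.
    """
    from google.genai import types

    return types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=schema,
    )


# Pydantic can show the JSON Schema it derives from the model,
# which is handy for checking field names before calling the API.
derived = PriceQuote.model_json_schema()
print(sorted(derived["properties"]))
```

Either build_config(PriceQuote) or build_config(PRICE_QUOTE_DICT_SCHEMA) can then be passed as the config argument of a generate_content call. The Pydantic form is usually less error-prone because the field names live in one place.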

The example below shows a procurement team automating extraction of product details from unstructured supplier emails - pulling out product name, SKU, price, availability, and lead time into a clean record for their inventory system.


Extracting structured data from three different supplier email styles,
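The original listing is not reproduced here, so what follows is a reconstruction sketch rather than the author's code. It assumes the google-genai SDK and Pydantic v2, a model name of "gemini-2.0-flash" (substitute whichever Flash model you use), and an API key in the environment. The field names in ProductQuote and the helpers extract_quote and format_quote are hypothetical, chosen to match the fields visible in the output below.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class Availability(str, Enum):
    IN_STOCK = "IN_STOCK"
    LOW_STOCK = "LOW_STOCK"
    OUT_OF_STOCK = "OUT_OF_STOCK"


class ProductQuote(BaseModel):
    """Hypothetical schema inferred from the fields in the printed output."""
    product_name: str
    sku: Optional[str]               # null when the email gives no SKU
    unit_price_inr: float
    moq_units: int
    lead_time_days: int
    availability: Availability
    supplier: str
    quote_valid_days: Optional[int]  # null when no validity period is stated
    notes: str


def extract_quote(email_text: str, model_name: str = "gemini-2.0-flash") -> ProductQuote:
    """Send one supplier email to Gemini and return a validated ProductQuote."""
    from google import genai          # lazy import: needs the google-genai package
    from google.genai import types

    client = genai.Client()           # assumes the API key is set in the environment
    response = client.models.generate_content(
        model=model_name,             # assumed model name
        contents="Extract the product quote from this supplier email:\n\n" + email_text,
        config=types.GenerateContentConfig(
            response_mime_type="application/json",
            response_schema=ProductQuote,
        ),
    )
    # response.text holds the JSON string; validate it into the model.
    return ProductQuote.model_validate_json(response.text)


def format_quote(index: int, quote: ProductQuote) -> str:
    """Render one quote in the report layout used in this article."""
    valid = "not specified"
    if quote.quote_valid_days is not None:
        valid = f"{quote.quote_valid_days} days"
    rows = [
        ("Product", quote.product_name),
        ("SKU", quote.sku or "not specified"),
        ("Price", f"Rs {quote.unit_price_inr} per unit"),
        ("MOQ", f"{quote.moq_units} units"),
        ("Lead time", f"{quote.lead_time_days} days"),
        ("Availability", quote.availability.value),
        ("Supplier", quote.supplier),
        ("Quote valid", valid),
        ("Notes", quote.notes),
    ]
    lines = [f"Quote {index}:"]
    for label, value in rows:
        name = label + ":"
        lines.append(f"  {name:<15}{value}")
    return "\n".join(lines)
```

With a list of email bodies in a variable such as emails, the report is produced by calling print(format_quote(i, extract_quote(body))) for each body, numbering from 1.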


It gives the following output,

=== EXTRACTED PRODUCT QUOTES ===

Quote 1:
  Product:       20mm HDPE Pipe
  SKU:           HDPE-20-100
  Price:         Rs 145.0 per unit
  MOQ:           500 units
  Lead time:     7 days
  Availability:  IN_STOCK
  Supplier:      Sharma Plastics Pvt Ltd
  Quote valid:   15 days
  Notes:         8500 metres available for immediate dispatch

Quote 2:
  Product:       Stainless Steel Fasteners M8x30
  SKU:           SS-M8-30-100
  Price:         Rs 8.5 per unit
  MOQ:           1000 units
  Lead time:     21 days
  Availability:  OUT_OF_STOCK
  Supplier:      Allied Fasteners Mumbai
  Quote valid:   30 days
  Notes:         Next batch expected in 21 days

Quote 3:
  Product:       3-phase Electric Motor 2HP
  SKU:           not specified
  Price:         Rs 12400.0 per unit
  MOQ:           1 units
  Lead time:     0 days
  Availability:  LOW_STOCK
  Supplier:      Electro Traders Pune
  Quote valid:   not specified
  Notes:         Only 3 units left, same-day dispatch before 2pm

# Three completely different email styles - all produce records with the same schema
# Schema is enforced at generation level, so each record parses without cleanup
# Ready to INSERT directly into your procurement database

Structured output use cases that run well in production include extracting contact details from business cards and email signatures, parsing purchase orders into line-item records, classifying and tagging support tickets with severity and category fields, extracting patient details from referral letters (with appropriate privacy controls), pulling financial figures from earnings announcements, and converting unstructured meeting notes into action item records. The Pydantic model serves double duty as both the schema definition and the validated Python object you write to your database - no separate hand-written validation layer needed.
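The double-duty point can be illustrated with a short sketch for the business-card use case. It assumes Pydantic v2 and uses an in-memory SQLite database; ContactCard, save_contact, the contacts table, and the sample JSON are all hypothetical stand-ins for a real model response and a real database.

```python
import sqlite3
from typing import Optional

from pydantic import BaseModel, ValidationError


class ContactCard(BaseModel):
    """Hypothetical schema for the business-card use case."""
    full_name: str
    email: Optional[str]
    phone: Optional[str]


def save_contact(conn: sqlite3.Connection, raw_json: str) -> ContactCard:
    """Validate a JSON response against the model, then insert it as a row."""
    # Raises ValidationError if a field is missing or has the wrong type.
    contact = ContactCard.model_validate_json(raw_json)
    conn.execute(
        "INSERT INTO contacts (full_name, email, phone) "
        "VALUES (:full_name, :email, :phone)",
        contact.model_dump(),  # dict keys match the named SQL parameters
    )
    conn.commit()
    return contact


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (full_name TEXT NOT NULL, email TEXT, phone TEXT)")

# Stand-in for response.text from a structured-output call (made-up data).
sample = '{"full_name": "A. Example", "email": "a@example.com", "phone": null}'
save_contact(conn, sample)

rows = conn.execute("SELECT full_name, email, phone FROM contacts").fetchall()
print(rows)
```

Because the same class is passed as response_schema and used for validation, a renamed field only has to change in one place, although the SQL statement still has to be kept in step with it.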


 
  


  