
Claude with FastAPI - Building a Production REST API

Author: Venkata Sudhakar

Wrapping Claude in a FastAPI service gives ShopMax India a production-grade REST API for AI features. Customer-facing apps, internal dashboards, and mobile apps all call the same endpoint, while Claude access is centralized with rate limiting, authentication, and logging in one place.

FastAPI's async support pairs naturally with the Anthropic async client. When a request hits the endpoint, the Claude call is awaited without blocking other requests, and the result is returned as JSON once the call completes. Pydantic models validate both request payloads and response shapes. An API key check, implemented as a header dependency, protects the endpoint from unauthorized use.

The following example shows ShopMax India's product assistant endpoint. It accepts a customer query and product context, calls Claude with a ShopMax system prompt, and returns a structured JSON response including the answer and token usage.


Starting the server and sending a test request with curl gives the following output:

INFO:     Started server process [12345]
INFO:     Uvicorn running on http://0.0.0.0:8000

# Test with curl:
# curl -X POST http://localhost:8000/api/v1/product-assistant \
#   -H "x-api-key: shopmax-internal-key-2024" \
#   -H "Content-Type: application/json" \
#   -d '{"customer_id": "CUST-881", "question": "Which TV under Rs 50000 has best picture quality?"}'

{
  "customer_id": "CUST-881",
  "answer": "For TVs under Rs 50,000 at ShopMax India, the LG 43-inch 4K UHD offers excellent picture quality with IPS panel technology...",
  "input_tokens": 98,
  "output_tokens": 87
}

In production at ShopMax India, add request timeout middleware (typically 30 seconds for Claude calls) to prevent hung connections. Reuse a single AsyncAnthropic client so that its underlying httpx connection pool is shared across requests, and set the pool limits to match your expected concurrency. Add structured logging with correlation IDs so you can trace a specific customer's AI request across logs. For high-traffic periods like Diwali sales, cache common product queries in Redis with a short TTL to avoid redundant Claude calls for the same popular questions.


 
  


  