Securing LLM APIs - Rate Limiting and Authentication with FastAPI

Author: Venkata Sudhakar

Exposing an LLM-powered API without proper security controls invites abuse, from uncontrolled request volume that drives up OpenAI costs to unauthorized access that leaks customer data. ShopMax India's AI product search and recommendation API, used by millions of customers across Hyderabad, Chennai, and Mumbai, must enforce API key authentication and per-key rate limiting to ensure fair use and to prevent denial-of-service scenarios caused by runaway clients or scrapers.

FastAPI provides a clean foundation for building secure LLM APIs. API key authentication is implemented as a dependency that reads the X-API-Key header and validates it against a registry of known keys. Rate limiting records request timestamps per key in an in-memory store (or Redis in production) and counts those that fall inside a sliding window. When a key exceeds its quota, the API returns HTTP 429 Too Many Requests. Each API key can have its own tier: for example, ShopMax India's mobile app gets 1000 requests per minute while third-party integrations get 100.
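The registry and its tiers can be sketched on their own, independent of any web framework. This is a minimal illustration rather than the article's implementation: the names KeyRecord, REGISTRY, digest, and lookup are hypothetical, and storing SHA-256 digests instead of plaintext keys is an added assumption, as is the revoked flag.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyRecord:
    """Metadata for one API key (field names are illustrative)."""
    owner: str
    limit_per_minute: int
    revoked: bool = False


def digest(api_key: str) -> str:
    # Assumption: keep only a SHA-256 digest so the registry never holds raw keys.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


# Hypothetical registry keyed by digest; the limits are the two tiers from the text.
REGISTRY = {
    digest("shopmax-mobile-v1"): KeyRecord("mobile-app", 1000),
    digest("shopmax-partner-v1"): KeyRecord("partner", 100),
}


def lookup(api_key: str) -> Optional[KeyRecord]:
    """Return the record for a known, non-revoked key, otherwise None."""
    record = REGISTRY.get(digest(api_key))
    if record is None or record.revoked:
        return None
    return record
```

A dependency could call lookup() and reject the request when it returns None, then read limit_per_minute from the record to decide which quota applies.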

The following example builds a FastAPI application with API key auth and per-key rate limiting that fronts a call to an LLM for product recommendations. It demonstrates key validation, quota tracking, and graceful rejection with informative error messages.

from fastapi import FastAPI, HTTPException, Depends
from fastapi.security import APIKeyHeader
from openai import OpenAI
from collections import defaultdict
import time

app = FastAPI()
client = OpenAI()  # reads the key from the OPENAI_API_KEY environment variable
api_key_header = APIKeyHeader(name="X-API-Key")

API_KEYS = {
    "shopmax-mobile-v1": {"limit": 1000, "tier": "premium"},
    "shopmax-partner-v1": {"limit": 100, "tier": "standard"},
}

request_counts = defaultdict(list)  # api_key -> timestamps of recent requests
WINDOW = 60  # sliding window length in seconds

def validate_api_key(api_key: str = Depends(api_key_header)):
    # Reject any key that is not in the registry.
    if api_key not in API_KEYS:
        raise HTTPException(status_code=403, detail="Invalid API key")
    return api_key

def rate_limit(api_key: str = Depends(validate_api_key)):
    now = time.time()
    window_start = now - WINDOW
    # Keep only the timestamps that still fall inside the sliding window.
    request_counts[api_key] = [t for t in request_counts[api_key] if t > window_start]
    limit = API_KEYS[api_key]["limit"]
    # Reject before recording, so refused requests do not consume quota.
    if len(request_counts[api_key]) >= limit:
        raise HTTPException(status_code=429, detail="Rate limit exceeded. Try again in 60 seconds.")
    request_counts[api_key].append(now)
    return api_key

@app.get("/recommend")
def recommend(product: str, city: str, api_key: str = Depends(rate_limit)):
    prompt = f"Recommend 3 accessories for {product} for a customer in {city}, India. Keep it brief."
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return {"product": product, "city": city, "recommendations": response.choices[0].message.content}

Requests with a valid key, an invalid key, and an exhausted quota produce the following responses:

# Valid request
GET /recommend?product=OnePlus+12&city=Hyderabad
X-API-Key: shopmax-mobile-v1

{
  "product": "OnePlus 12",
  "city": "Hyderabad",
  "recommendations": "1. Sandstone case Rs 499\n2. 80W charger Rs 1,299\n3. Screen protector Rs 299"
}

# Invalid key (HTTP 403)
{"detail": "Invalid API key"}

# Rate limit exceeded (HTTP 429)
{"detail": "Rate limit exceeded. Try again in 60 seconds."}
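FastAPI runs plain def dependencies and endpoints in a thread pool, so two requests for the same key can in principle interleave while the per-key timestamp list is being filtered and updated. The sketch below is one possible variant of the same sliding-window bookkeeping, guarded by a lock, that also reports how long the caller has to wait. The class name SlidingWindowLimiter, its check method, and the injectable clock are illustrative choices, not part of the original example.

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-key sliding-window limiter guarded by a lock (illustrative name)."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.hits = defaultdict(deque)  # key -> timestamps, oldest first
        self.lock = threading.Lock()

    def check(self, key, limit):
        """Return (allowed, retry_after_seconds); record the hit only if allowed."""
        now = self.clock()
        with self.lock:
            q = self.hits[key]
            # Drop timestamps that have left the window.
            while q and q[0] <= now - self.window:
                q.popleft()
            if len(q) >= limit:
                # The oldest hit leaves the window at q[0] + window.
                retry_after = q[0] + self.window - now if q else self.window
                return False, retry_after
            q.append(now)
            return True, 0.0
```

In the rate_limit dependency, the second value could be rounded up and passed to HTTPException through its headers argument as a Retry-After header, which gives clients a more precise wait time than a fixed 60 seconds.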

In production, replace the in-memory request_counts dict with Redis, using a sorted set per key, so that rate-limit state survives restarts and is shared across multiple API server instances. Add JWT-based auth for ShopMax India's internal services to avoid sharing static keys. Store API key metadata (owner, creation date, last used) in a database so you can revoke compromised keys instantly. Monitor the rate limit hit rate in Grafana: a key that consistently hits its quota signals that either the quota is too low or the client has a bug that makes redundant calls.
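The Redis sorted-set approach might look roughly like the following. This is a sketch under stated assumptions: redis_client is assumed to be a redis-py style client (for example one created with redis.Redis), the function name allow_request and the "ratelimit:" key prefix are hypothetical, and the now parameter exists only to make the logic testable.

```python
import time
import uuid


def allow_request(redis_client, api_key, limit, window=60, now=None):
    """Sliding-window check backed by one Redis sorted set per API key.

    Assumes `redis_client` exposes redis-py's pipeline(), zremrangebyscore(),
    zadd(), zcard(), expire() and zrem() methods.
    """
    now = time.time() if now is None else now
    key = f"ratelimit:{api_key}"  # hypothetical key naming scheme
    member = f"{now}:{uuid.uuid4().hex}"  # unique, so same-time hits do not collide

    pipe = redis_client.pipeline()  # redis-py wraps this in MULTI/EXEC by default
    pipe.zremrangebyscore(key, 0, now - window)  # drop hits older than the window
    pipe.zadd(key, {member: now})                # record this hit, scored by time
    pipe.zcard(key)                              # count hits now inside the window
    pipe.expire(key, window)                     # let idle keys expire on their own
    _, _, count, _ = pipe.execute()

    if count > limit:
        # Undo the tentative hit so rejected requests do not consume quota.
        redis_client.zrem(key, member)
        return False
    return True
```

Because every server instance talks to the same sorted set, the quota is enforced across the whole deployment rather than per process. The rate_limit dependency would call allow_request with the key's tier limit and raise HTTP 429 when it returns False.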
