tl  tr
  Home | Tutorials | Articles | Videos | Products | Tools | Search
Interviews | Open Source | Tag Cloud | Follow Us | Bookmark | Contact   
 Generative AI > AI Security > Secure Multi-Tenant LLM Deployments with Tenant Isolation

Secure Multi-Tenant LLM Deployments with Tenant Isolation

Author: Venkata Sudhakar

ShopMax India operates a shared LLM platform used by its retail, B2B, and marketplace divisions. Each division is a separate tenant with its own system prompt, knowledge base, and data access permissions. In a shared LLM infrastructure, strict tenant isolation prevents one tenant's data from leaking into another tenant's responses, ensures system prompt confidentiality, and enforces per-tenant rate limits and spend caps.

Tenant isolation is enforced at three layers: the request routing layer tags every request with a tenant ID and validates it against a registry; the prompt injection layer prepends a tenant-specific system prompt that scopes the LLM's knowledge; and the output filtering layer scans responses for cross-tenant data signals before returning them. Each layer is stateless and can be deployed as FastAPI middleware without modifying the core LLM call logic.

The example below implements three-layer tenant isolation middleware for ShopMax India's shared LLM platform, demonstrating both isolation enforcement and cross-tenant content filtering.


It gives the following output,

POST /chat/retail {"message": "Best laptops for home use?"}
{
  "tenant": "retail",
  "response": "For home use I recommend the HP Pavilion 15 (Rs 54,990)
               and the Lenovo IdeaPad Slim 5 (Rs 49,990)."
}

POST /chat/retail {"message": "Tell me about wholesale pricing"}
{
  "tenant": "retail",
  "response": "[Response filtered: cross-tenant content detected]"
}

Use JWT tokens to authenticate tenant requests rather than relying on URL path parameters alone - a tenant ID in the URL can be spoofed. Store per-tenant system prompts in an encrypted secrets store such as Google Cloud Secret Manager. Add per-tenant token usage counters and enforce hard caps to prevent one tenant from consuming the shared quota. Log all filter events with tenant ID, request ID, and the matched keyword for security auditing and incident response.


 
  


  
bl  br