
Secrets Detection in LLM Inputs with Python

Author: Venkata Sudhakar

ShopMax India's LLM-powered support chatbot accepts free-text input from thousands of users daily. Users sometimes paste sensitive data into the chat window, such as API keys, credit card numbers, Aadhaar numbers, or internal order tokens. Sending this data to a third-party LLM API can violate data protection obligations and exposes users to risk. A secrets detection layer scans every input before it reaches the LLM and automatically redacts anything that matches a sensitive pattern.

The detection layer matches regex patterns against known secret formats: API key prefixes, credit card numbers, Indian mobile numbers, PAN card numbers, and Aadhaar numbers. When a pattern matches, the matched text is replaced with a placeholder before the input is passed to the LLM, so the conversation can continue without the sensitive data. Every redaction event is logged with the secret type and session ID for compliance reporting.

The example below shows a secrets scanner for ShopMax India that detects and redacts common Indian and global secret patterns before sending user input to the LLM.
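
A minimal sketch of such a scanner might look like the following. The pattern set, the placeholder format, and the names `PATTERNS` and `redact_secrets` are assumptions made for illustration, not ShopMax India's actual code, and the call to the LLM is left out.

```python
import logging
import re

logging.basicConfig(format="%(levelname)s: %(message)s", level=logging.INFO)
logger = logging.getLogger("secrets_scanner")

# Illustrative patterns only (assumed, not exhaustive). Order matters:
# longer digit runs such as card numbers are redacted before shorter ones.
PATTERNS = {
    "api_key": re.compile(r"\b(?:AKIA[0-9A-Z]{16}|sk-[A-Za-z0-9_-]{20,})"),
    "credit_card": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "aadhaar": re.compile(r"\b[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}\b"),
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    "indian_mobile": re.compile(r"(?<!\d)(?:\+91[ -]?)?[6-9]\d{9}(?!\d)"),
}


def redact_secrets(text, session_id):
    """Return (redacted_text, findings), where findings is a list of
    (secret_type, count) tuples. The secret values are never logged."""
    findings = []
    for name, pattern in PATTERNS.items():
        # subn returns the new string and the number of replacements made.
        text, count = pattern.subn(f"[REDACTED_{name.upper()}]", text)
        if count:
            findings.append((name, count))
    if findings:
        # The session ID is attached to the log record as a structured field.
        logger.warning(
            "Redacted %s from user input",
            findings,
            extra={"session_id": session_id},
        )
    return text, findings


if __name__ == "__main__":
    message = (
        "My order ORD-9821 is delayed. "
        "Aadhaar 2345 6789 0123, phone 9876543210."
    )
    safe_text, found = redact_secrets(message, session_id="sess-001")
    print(safe_text)
```

The redacted string, not the original message, is what would then be sent to the LLM.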


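With the scanner in front of the LLM call, a message that contains an Aadhaar number and a mobile number produces output like the following: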

WARNING: Redacted [('aadhaar', 1), ('indian_mobile', 1)] from user input
Response: I can see your order ORD-9821 is delayed. Your personal details have
been removed for security. Please check your registered email for tracking
updates or contact support at [email protected] with your order ID.

Log all redaction events with the secret type and session ID but never log the actual secret value. Extend the pattern list regularly as new API key formats are released by cloud providers. For credit card numbers, apply the Luhn algorithm in addition to regex to reduce false positives. In high-volume deployments, consider using an open-source library such as Microsoft Presidio for a comprehensive PII detection suite that covers dozens of entity types out of the box.
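
The Luhn check mentioned above can be sketched as follows. This is a minimal illustration that assumes 16-digit card numbers; `CARD_RE`, `luhn_valid`, and `redact_cards` are hypothetical names, not part of any library.

```python
import re

# Assumed card format: 16 digits, optionally grouped in fours.
CARD_RE = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text):
    """Redact only those regex matches that also pass the Luhn check."""
    def replace(match):
        candidate = match.group()
        return "[REDACTED_CREDIT_CARD]" if luhn_valid(candidate) else candidate
    return CARD_RE.sub(replace, text)
```

A 16-digit order reference that fails the checksum is left untouched, while a number that passes is redacted.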


 
  


  