|
|
NeMo Guardrails - Adding Safety Rails and Topic Restrictions to LLM Applications
Author: Venkata Sudhakar
ShopMax India's AI assistant must stay focused on electronics products and avoid answering off-topic questions about competitors, politics, or unrelated subjects. NVIDIA NeMo Guardrails provides a declarative framework to define conversation flows, topic restrictions, and safety checks using Colang configuration files. It wraps any LLM and intercepts conversations before they reach the model.
NeMo Guardrails uses Colang files to define rails - user message patterns and the allowed bot responses. Rails are organized into input rails (check before sending to LLM), output rails (check LLM response), and dialog rails (control conversation flow). A config.yml file specifies the LLM model and the Colang files to load. The LLMRails class loads the config and processes every message through the guardrail pipeline.
The example below configures ShopMax India's assistant with a topic restriction rail that blocks off-topic queries. The Colang config defines a pattern for detecting off-topic questions and a safe fallback response written to a local config directory.
It gives the following output,
User: What laptops do you have under Rs 30000?
Bot : Here are some laptops under Rs 30000 at ShopMax India: Lenovo IdeaPad Slim 3...
User: What is the weather in Mumbai today?
Bot : I only assist with ShopMax India electronics and orders.
User: Is the Samsung Galaxy S24 available in Chennai?
Bot : Yes, the Samsung Galaxy S24 is available at our Chennai store and online...
In production, version your Colang rail files in git alongside your application code. Use NeMo's built-in topical rails for common categories - politics, medical advice, financial advice - and write custom rails for domain restrictions. Test rails using the nemoguardrails chat command before deploying. Monitor the topical rail trigger rate in your observability dashboard - a sudden spike may indicate a prompt injection campaign targeting your application.
|
|