Programmable Guardrails

What is NeMo Guardrails?

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational applications. Guardrails (or “rails” for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more. The toolkit enables developers building LLM-based applications to easily add programmable guardrails between the application code and the LLM, providing a critical control layer for production deployments.
NeMo Guardrails is developed by NVIDIA and is licensed under Apache 2.0. Learn more in the research paper published at EMNLP 2023.

Why NeMo Guardrails?

Building production-ready LLM applications requires more than just connecting to an API. You need control, safety, and reliability. NeMo Guardrails provides:

Safety & Trust

Define rails to guide and safeguard conversations. Prevent your LLM from engaging in unwanted topics or generating harmful content.

Controllable Dialog

Steer the LLM to follow pre-defined conversational paths, allowing you to design interactions following conversation design best practices.

Secure Tool Integration

Connect LLMs to other services (tools) seamlessly and securely with execution rails that validate inputs and outputs.

Multi-Layer Protection

Apply guardrails at five distinct points: input, dialog, retrieval, execution, and output for comprehensive control.

Key Features

Five Types of Guardrails

NeMo Guardrails supports five main types of guardrails that can be applied at different stages:
1. Input Rails

Applied to user input before processing. Can reject or alter the input (e.g., mask sensitive data, rephrase).

2. Dialog Rails

Influence how the LLM is prompted and control conversational flow. Determine whether actions should execute or predefined responses should be used.

3. Retrieval Rails

Applied to retrieved chunks in RAG scenarios. Can reject or modify chunks before they're used to prompt the LLM.

4. Execution Rails

Applied to the input and output of custom actions (tools) that the LLM needs to call.

5. Output Rails

Applied to LLM-generated output before it is returned to the user. Can reject or alter the output (e.g., remove sensitive data).
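As a sketch of how these attachment points look in practice, a guardrails configuration declares which flows run at each stage. The model settings below are illustrative, and the flow names `self check input` / `self check output` refer to rails from the built-in library:

```yaml
# config.yml (illustrative): the model to use, plus the rails to activate
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

rails:
  input:        # input rails: run on user messages before processing
    flows:
      - self check input
  output:       # output rails: run on LLM output before it is returned
    flows:
      - self check output
```

Dialog rails are defined through Colang flows, and retrieval and execution rails are configured analogously for RAG chunks and custom actions.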

Built-in Guardrails Library

NeMo Guardrails comes with a comprehensive library of pre-built guardrails:
  • LLM Self-Checking: Input/output moderation, fact-checking, hallucination detection
  • NVIDIA Safety Models: Content safety, topic safety
  • Jailbreak & Injection Detection: Protect against prompt injection attacks
  • Third-Party Integrations: ActiveFence, AlignScore, and more
Built-in guardrails may not be suitable for all production use cases. Work with your team to ensure guardrails meet requirements for your industry and use case.
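For instance, the LLM self-checking rails are driven by prompts you write yourself; a minimal sketch, where the task name `self_check_input` follows the library's convention and the prompt wording is illustrative:

```yaml
# prompts.yml (illustrative): the prompt used by the self check input rail
prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with
      the company policy for talking with the company bot.

      User message: "{{ user_input }}"

      Question: Should the user message be blocked (Yes or No)?
      Answer:
```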

Colang Modeling Language

NeMo Guardrails introduces Colang, a modeling language specifically created for designing flexible, yet controllable, dialogue flows. Colang has a Python-like syntax and makes it easy to:
  • Define conversational patterns and flows
  • Specify allowed and disallowed topics
  • Create custom input/output validation rules
  • Implement complex dialog control logic
Two versions of Colang are supported: 1.0 (default) and 2.0. Both are fully supported with extensive examples.
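A minimal Colang 1.0 sketch of a dialog rail that keeps the bot away from politics (the flow, message, and example utterances are illustrative):

```colang
define user ask politics
  "what do you think about the president?"
  "which party should I vote for?"

define bot refuse to answer politics
  "I'm a support assistant, so I don't discuss politics."

define flow politics
  user ask politics
  bot refuse to answer politics
```

When a user message matches `ask politics`, the flow steers the LLM to the predefined refusal instead of generating a free-form answer.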

Use Cases

You can use programmable guardrails in different types of applications:
  • Enforce fact-checking and output moderation over a set of documents using Retrieval-Augmented Generation (RAG).
  • Ensure chatbots stay on topic and follow designed conversational flows for support, sales, or specialized domains.
  • Add guardrails to your custom LLM deployments for safer customer interactions.
  • Wrap guardrails around any LangChain chain or Runnable for enhanced safety and control.

LLM Vulnerability Protection

NeMo Guardrails provides robust protection against common LLM vulnerabilities:
(Figure: LLM vulnerability scan results)
The toolkit includes mechanisms for protecting against:
  • Jailbreak attempts
  • Prompt injection attacks
  • Sensitive data leakage
  • Off-topic conversations
  • Harmful content generation

How It Works

The basic flow is simple:
  1. Load a guardrails configuration from YAML and Colang files
  2. Create an LLMRails instance with your configuration
  3. Call the LLM through the guardrails layer using generate() or generate_async()
from nemoguardrails import LLMRails, RailsConfig

# Load configuration
config = RailsConfig.from_path("PATH/TO/CONFIG")
rails = LLMRails(config)

# Generate with guardrails
completion = rails.generate(
    messages=[{"role": "user", "content": "Hello world!"}]
)
The input and output format is compatible with OpenAI’s Chat Completions API, making integration straightforward.
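Because the format follows the Chat Completions convention, a multi-turn conversation is passed as a growing list of role/content message dicts. A sketch, with illustrative conversation content; the `rails.generate` call is commented out because it requires a configured model:

```python
# Multi-turn history uses the same role/content message dicts as
# OpenAI's Chat Completions API; append each reply before the next call.
history = [
    {"role": "user", "content": "What can you help me with?"},
    {"role": "assistant", "content": "I can answer questions about our products."},
    {"role": "user", "content": "Are there any discounts right now?"},
]
# completion = rails.generate(messages=history)
# history.append(completion)  # the reply is an assistant message dict
```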

What Makes NeMo Guardrails Different?

While there are many approaches to adding guardrails to LLMs, NeMo Guardrails stands out by:
  • Providing a unified toolkit that integrates multiple complementary approaches (moderation endpoints, critique chains, parsing, individual guardrails)
  • Offering dialog modeling capabilities that enable both precise dialog control and fine-grained guardrail application
  • Supporting multiple LLMs including OpenAI GPT-3.5/4, LLaMa-2, Falcon, Vicuna, Mosaic, and more
  • Being async-first with full support for both sync and async APIs
  • Integrating seamlessly with LangChain and other popular frameworks

Next Steps

Installation

Get NeMo Guardrails installed and ready to use

Quickstart

Build your first guardrails-protected application

GitHub Repository

View the source code and contribute

Official Documentation

Read the comprehensive documentation