
Chloe Williams for Zilliz

Originally published at zilliz.com

NeMo Guardrails: Elevating AI Safety and Reliability

Introduction

Nowadays, the importance of AI safety and reliability is growing as organizations increasingly rely on AI technologies. Ensuring AI systems' accuracy, dependability, and robustness is essential for ethical reasons and for maintaining trust and credibility in the market. As AI becomes more integrated into various sectors, prioritizing safety is crucial to prevent errors and protect against attacks.

Now, we have NVIDIA NeMo Guardrails, an open-source toolkit that provides the necessary tools to ensure that smart applications utilizing large language models (LLMs) are accurate, relevant, appropriate, and secure. This toolkit includes all the code, examples, and documentation required for businesses to enhance the safety of AI-driven text generation applications.


Understanding NeMo Guardrails

AI's automated text generation can come with many potential risks and inaccuracies. Hence, we need something that acts as a barrier to keep the generated text from going off track: a guardrail, to be exact. That's where NeMo Guardrails comes in.

In programming, guardrails are programmable rules or constraints that sit between the application code (or the user) and the large language model (LLM). These guardrails oversee, influence, and control the interactions a user has with the chatbot. NeMo Guardrails supports three main types of guardrails: topical guardrails, safety guardrails, and security guardrails.


Let's start with topical guardrails, which are designed to keep the AI's responses within the scope of the user's query, ensuring that the generated text remains on-topic and relevant to the user's needs. Safety guardrails apply built-in filters and monitors so that all generated content adheres to predefined ethical standards and is free of bias, offensive language, and inappropriate content. Finally, security guardrails protect against potential attacks on the AI models, such as data poisoning or model inversion attacks; these features help secure sensitive data and maintain user trust.
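To make the topical case concrete, here is a minimal sketch of what a topical rail might look like in Colang, the language NeMo Guardrails uses for the .co files shown later in the configuration folder. The "ask off topic" intent, its sample utterances, and the canned refusal are purely illustrative:

define user ask off topic
  "What do you think about politics?"
  "Can you give me stock tips?"

define bot refuse off topic
  "Sorry, I can only help with questions about this product."

define flow off topic
  user ask off topic
  bot refuse off topic

When a user message matches the "ask off topic" intent, the flow steers the bot to the fixed refusal instead of letting the LLM answer freely.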

In addition, NeMo Guardrails provides comprehensive documentation and many practical examples, which help developers understand how to implement and customize the AI safety features according to their specific requirements.

Let’s now look at an example of how to install, implement, and use NeMo Guardrails.

Step 1: Install

pip install nemoguardrails

Step 2: Create a new guardrails configuration

Every guardrails configuration must be stored in its own folder. The standard folder structure is as follows:

.
├── config
│   ├── actions.py
│   ├── config.py
│   ├── config.yml
│   ├── rails.co
│   ├── ...

Create a folder, such as config, for your configuration:

mkdir config

The config.yml file contains all the general configuration options (e.g., LLM models, active rails, custom configuration data), config.py contains any custom initialization code, and actions.py contains any custom Python actions. The .co files (such as rails.co) hold the rail definitions themselves, written in Colang.
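As an illustration, here is a minimal sketch of what a custom action in actions.py might look like, following the @action decorator pattern from the NeMo Guardrails custom-actions API; the check_blocked_terms name and the list of blocked terms are hypothetical:

from typing import Optional

from nemoguardrails.actions import action


@action()
async def check_blocked_terms(context: Optional[dict] = None) -> bool:
    # The rails runtime passes conversation state in via the context dict;
    # "bot_message" holds the last generated bot response.
    bot_response = (context or {}).get("bot_message") or ""
    # Hypothetical list of terms the bot should never mention.
    blocked_terms = ["proprietary", "confidential"]
    return any(term in bot_response.lower() for term in blocked_terms)

An output rail flow in a .co file can then call this action and block or rewrite the bot message whenever it returns True.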

Here’s an example of config.yml:

# config.yml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

rails:
  # Input rails are invoked when new input from the user is received.
  input:
    flows:
      - check jailbreak
      - mask sensitive data on input

  # Output rails are triggered after a bot message has been generated.
  output:
    flows:
      - self check facts
      - self check hallucination
      - activefence moderation
      - gotitai rag truthcheck

  config:
    # Configure the types of entities that should be masked on user input.
    sensitive_data_detection:
      input:
        entities:
          - PERSON
          - EMAIL_ADDRESS

This guardrail configuration specifies rules for ensuring the safety and reliability of an AI model, focusing on input and output interactions. Let's break it down section by section, starting with the input rails and their configured flows. The check jailbreak flow looks for attempts to exploit vulnerabilities in the AI model or the system it runs on, such as unauthorized access or manipulation. The mask sensitive data on input flow masks or hides sensitive information received from the user (e.g., personal names, email addresses) to protect privacy and security.

Next come the output rails, with the configured flows self check facts, self check hallucination, activefence moderation, and gotitai rag truthcheck. The self check facts flow verifies the factual accuracy of the generated response by cross-referencing it against reliable sources. The self check hallucination flow checks the response for hallucinated or nonsensical content, ensuring coherence and relevance. The activefence moderation flow uses ActiveFence's moderation service to filter harmful or inappropriate content out of the response. Finally, the gotitai rag truthcheck flow uses GotItAI's truth-checking service to validate the response against factual information.

Overall, this guardrail configuration aims to enforce various safety measures throughout the AI interaction process, including input validation, output verification, and protection of sensitive data, thereby enhancing the reliability and trustworthiness of the AI model.

Step 3: Load and use the guardrails configuration

Load the guardrails configuration, create an LLMRails instance, and then make calls to the LLM using the generate or generate_async methods.

from nemoguardrails import LLMRails, RailsConfig

# Load a guardrails configuration from the specified path
# (e.g., the config folder created in step 2).
config = RailsConfig.from_path("PATH/TO/CONFIG")
rails = LLMRails(config)

# Generate a response; the configured input and output rails
# are applied to the conversation automatically.
completion = rails.generate(
    messages=[{"role": "user", "content": "Hello world!"}]
)

Sample output

{"role": "assistant", "content": "Hi! How can I help you?"}

As you can see from the code above, implementing and using NeMo Guardrails is really simple, with just three steps to the process. I hope you also noticed how easily you can customize the safety features to your specific requirements in the configuration files.
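Step 3 also mentions generate_async. As a quick aside, here is a minimal sketch of how the asynchronous variant could be called, assuming the same placeholder configuration path as above:

import asyncio

from nemoguardrails import LLMRails, RailsConfig


async def main():
    # Load the same guardrails configuration used in step 3.
    config = RailsConfig.from_path("PATH/TO/CONFIG")
    rails = LLMRails(config)

    # generate_async is the awaitable counterpart of generate.
    completion = await rails.generate_async(
        messages=[{"role": "user", "content": "Hello world!"}]
    )
    print(completion)


asyncio.run(main())

This is handy when the guardrails sit inside an async web service, where blocking calls would stall the event loop.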

Practical Applications

NeMo Guardrails can be exceptionally useful in real-world applications where large language models (LLMs), including those in RAG pipelines, must be carefully managed to avoid risks and ensure safety and compliance. A few of these applications include healthcare advice platforms, educational tools, and financial advice tools.

When it comes to healthcare advice platforms, there is no room for error, since a mistake could harm the user, so guardrails are crucial to prevent that from happening. In educational tools, NeMo Guardrails can help ensure that content provided by LLMs is accurate, age-appropriate, and pedagogically sound. Finally, for financial advice tools, it is critical that the information provided by the model is accurate and not misleading, as unauthorized or speculative financial recommendations could have serious repercussions.

Integrating NeMo Guardrails

Integrating NeMo Guardrails into AI models and systems involves several best practices to ensure effective and safe usage of large language models (LLMs) across different platforms and technologies. Focusing on NVIDIA's ecosystem, which supports a wide range of AI technologies, there are a few things you should consider when incorporating NeMo Guardrails.

First, clearly define what you need the guardrails to achieve. This could include content moderation, data privacy, compliance with specific regulations, or preventing the model from generating harmful content. Then, develop a deep understanding of the capabilities and limitations of the AI models you use, as this will help you configure your guardrails correctly. Next, customize the guardrails to the specific needs of your application and test them regularly in real-world scenarios to ensure they work as intended.

Moreover, I recommend implementing mechanisms to gather user feedback on the AI's performance with respect to the guardrails, then using this feedback to fine-tune the guardrails and improve user satisfaction and safety. Last but not least, ensure that the implementation of guardrails adheres to ethical AI principles, promoting fairness, accountability, and transparency.

Let's now talk about NeMo Guardrails' compatibility with AI technologies and platforms. NeMo Guardrails is particularly relevant within NVIDIA's ecosystem, which includes various tools and platforms that can leverage these guardrails. One example is NVIDIA Riva, NVIDIA's speech AI SDK, which developers can pair with guardrails when building conversational applications. NVIDIA platforms such as DGX systems and the CUDA-X AI libraries are designed for scalable AI deployment, so guardrails need to be scalable and efficient enough not to hinder the performance benefits that NVIDIA hardware provides.

NeMo Guardrails is designed to work alongside the deep learning frameworks commonly used within NVIDIA's ecosystem, such as TensorFlow and PyTorch, interacting with them smoothly so that constraints or modifications can be applied at different stages of the model lifecycle.

Conclusion

I hope by now you understand how NeMo Guardrails elevates AI safety and reliability, what guardrails are and how they work, where they are useful in practice, and how simple it is to add NeMo Guardrails to your AI integration, along with the best practices for doing so.

We’ve talked about how NeMo Guardrails' robust framework works efficiently and effectively to ensure that smart applications and conversational AI built on large language models (LLMs) are accurate, relevant, appropriate, and secure. NeMo Guardrails enables developers around the world to enforce standards and rules that keep AI interactions safe, appropriate, and compliant with regulatory requirements. By filtering, modifying, or redirecting AI outputs, these guardrails help prevent the propagation of harmful content, protect user data privacy, and uphold ethical standards, thereby fostering trust and confidence in AI applications.

As AI continues to evolve, incorporating advanced safety features like NeMo Guardrails will be crucial in navigating the challenges posed by these powerful technologies. Developers, researchers, and organizations should prioritize understanding and implementing these guardrails, ensuring that AI systems are not only effective but also aligned with societal values and safety standards.
