Building AI Applications with Enterprise-Grade Security Using RAG and FGA

This post is written by Bartosz Pietrucha

Introduction

Building enterprise-grade LLM applications is a necessity in today's business environment. While the accessibility of models and APIs is improving, one significant challenge remains: ensuring their security and effectively managing their permissions.

To address this, Fine-Grained Authorization (FGA) and Retrieval Augmented Generation (RAG) are effective strategies for building secure, context-aware AI applications that maintain strict access control. In this article, we’ll explore how FGA and RAG can be applied in a healthcare setting while safeguarding sensitive data.

We’ll do this by guiding you through the implementation of a Relationship-Based Access Control (ReBAC) authorization system that supports real-time updates, using three tools: AstraDB, Langflow, and Permit.io.

Use Case Example: Healthcare Applications

To better understand the complexity of authorization in LLM applications, and the solutions offered by FGA and RAG, we can look at the digital healthcare space, as it presents a perfect example of a domain where both AI capabilities and stringent security are essential. Healthcare providers increasingly want to leverage LLMs to streamline workflows, improve decision-making, and provide better patient care. Doctors and patients alike want easy access to medical records through intuitive AI interfaces such as chatbots.

However, medical data is highly sensitive and should be carefully regulated. While LLMs can provide intelligent insights, we must ensure that they only access and reveal information that users are authorized to see. Doctors, for example, should only see diagnoses from their assigned medical centers, and patients should only be able to access their own records.

Security Through Fine-Grained Authorization

Proceeding with the digital healthcare scenario, let’s look at an example medical application.

This application comprises several resource types, a couple of roles, and a few relationships between these entities:

  1. Resource Types:

    • Medical Centers (e.g., London, Warsaw)
    • Visits (e.g., Morning Visit, Afternoon Visit)
    • Diagnoses (e.g., Diabetes, Headache, Virus)
  2. Roles:

    • Doctors (e.g., Dr Bartosz)
    • Patients (e.g., Gabriel, Olga)
  3. Relationships:

    • Doctors are assigned to Medical Centers
    • Visits belong to Medical Centers
    • Diagnoses are part of Visits
    • Patients are connected to their Visits

As you can see, the hierarchical relationships between our resources mean that traditional role-based access control, where permissions are assigned directly, will be insufficient.

The complexity of this application's authorization requires a more fine-grained authorization (FGA) solution - in this case, Relationship-Based Access Control (ReBAC).

ReBAC, an authorization model inspired by Google's Zanzibar paper, derives permissions from relationships between entities in the system - unlike traditional role-based access control (RBAC), where permissions are assigned directly.

The power of ReBAC lies in how permissions are derived through these relationships. Let’s look at a visual representation of our example:

(Diagram: the relationship graph connecting doctors, medical centers, visits, and diagnoses in our example)

In the above example, Dr Bartosz has access to the Virus diagnosis not because of a directly granted permission but rather because he is assigned to Warsaw Medical Center, which contains the Afternoon Visit, which contains the diagnosis. Thus, the relationships between these resources form a chain that allows us to derive access permissions.
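To make this derivation concrete, here is a minimal, framework-agnostic sketch (plain Python with made-up entity names, not Permit.io's API) of how access can be derived by walking a chain of relationship tuples rather than checking a direct grant:

```python
# Relationship tuples of the form (subject, relation, object) - the core data of ReBAC.
RELATIONSHIPS = {
    ("dr_bartosz", "assigned_to", "center:warsaw"),
    ("center:warsaw", "parent_of", "visit:afternoon"),
    ("visit:afternoon", "parent_of", "diagnosis:virus"),
}

def can_access(user: str, resource: str) -> bool:
    """Derive access by following relationship edges from the user to the resource."""
    visited, frontier = set(), {user}
    while frontier:
        # Everything reachable one more hop away from the current frontier.
        reachable = {obj for (subj, _rel, obj) in RELATIONSHIPS
                     if subj in frontier and obj not in visited}
        if resource in reachable:
            return True
        visited |= reachable
        frontier = reachable
    return False

print(can_access("dr_bartosz", "diagnosis:virus"))     # True - derived through the chain
print(can_access("dr_bartosz", "diagnosis:diabetes"))  # False - no connecting chain exists
```

A production system would also track which relations confer which permissions; here, any chain grants access, which is enough to illustrate the derivation.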

There are clear benefits to using this approach:

  • It models real-world organizational structures naturally
  • Permissions adapt automatically as relationships change
  • It provides fine-grained control while remaining scalable

But the challenge doesn’t end there: because we are building a system that works with LLMs, it must be able to evaluate these relationship chains in real time. In the next section, we will build an implementation that allows exactly that.

Before we continue, let's quickly review the authorization rules we want to ensure are in place:

  1. Only doctors with valid relationships to a medical center can see its visits
  2. Access to diagnoses is automatically derived from these relationships
  3. Changes in relationships (e.g., doctor reassignment) immediately affect access rights

We can enforce these requirements inside an LLM application by combining our ReBAC model with Retrieval Augmented Generation (RAG).

Retrieval Augmented Generation (RAG)

RAG (Retrieval Augmented Generation) is a technique that enhances LLM outputs by combining two key steps: first, retrieving relevant information from a knowledge base, and then using that information to augment the LLM's context for more accurate generation. While RAG can work with traditional databases or document stores, vector databases are particularly powerful for this purpose because they can perform semantic similarity search, finding conceptually related information even when exact keywords don't match.

In practice, this means that when a user asks about "heart problems," the system can retrieve relevant documents about "cardiac issues" or "cardiovascular disease," making the LLM's responses both more accurate and comprehensive. The "generation" part then involves the LLM synthesizing this retrieved context with its pre-trained knowledge to produce relevant, factual responses that are grounded in your specific data.
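As a toy illustration of the retrieval step (the three-dimensional vectors below are made up; a real pipeline would obtain embeddings from a model such as mistral-embed), semantic search boils down to ranking stored documents by vector similarity to the query:

```python
import numpy as np

# Made-up embeddings; a real embedding model places "cardiac issues"
# near "heart problems" in vector space even though the words differ.
documents = {
    "Patient notes on cardiac issues":  np.array([0.9, 0.1, 0.0]),
    "Cardiovascular disease follow-up": np.array([0.8, 0.2, 0.1]),
    "Seasonal allergy treatment plan":  np.array([0.0, 0.1, 0.9]),
}
query = np.array([0.85, 0.15, 0.05])  # stand-in embedding for "heart problems"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by semantic similarity to the query vector.
ranked = sorted(documents, key=lambda d: cosine(documents[d], query), reverse=True)
print(ranked[0])  # "Patient notes on cardiac issues" - no keyword overlap needed
```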

For our implementation, we will use AstraDB as our vector database. AstraDB offers the following benefits:

  • It efficiently stores and searches through embeddings
  • It scales well with growing data
  • It integrates well with LLM orchestration frameworks like Langflow (which we will cover later in the article)

To implement our RAG pipeline, we'll also use Langflow, an open-source framework that makes building these systems intuitive through its visual interface. Langflow can run in a local Python environment or on the cloud-hosted DataStax platform. In our case, we choose the second option, creating a serverless (vector) AstraDB database at https://astra.datastax.com.

In our implementation, authorization checks should happen at a crucial moment - after retrieving data from the vector database but before providing it to the LLM as context. This way, we maintain search efficiency by first finding all relevant information and later filtering out unauthorized data before it ever reaches the LLM. The LLM can only use and reveal information the user is authorized to see.
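Conceptually, the ordering looks like the following runnable sketch (the retrieve, is_permitted, and generate stubs are stand-ins for the real Langflow components, not their actual APIs):

```python
import asyncio

async def retrieve(question: str) -> list[str]:
    # Stand-in for the AstraDB similarity search over all diagnoses.
    return ["Seasonal Migraine", "Flu virus with high fever", "Diabetes"]

async def is_permitted(user_id: str, doc: str) -> bool:
    # Stand-in for the PDP check; a hard-coded rule for illustration.
    return doc != "Diabetes"

async def generate(context: list[str], question: str) -> str:
    # Stand-in for the LLM call, which receives only permitted context.
    return f"Answering {question!r} using context: {context}"

async def answer(user_id: str, question: str) -> str:
    candidates = await retrieve(question)                                  # 1. retrieve broadly
    permitted = [d for d in candidates if await is_permitted(user_id, d)]  # 2. filter
    return await generate(permitted, question)                             # 3. generate

print(asyncio.run(answer("bartosz@health.app", "list all the recent diagnoses")))
```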

These security checks are implemented using Permit.io, which provides the infrastructure for evaluating complex relationship chains in real time. As your data grows and relationships become more complex, the system continues to ensure that each piece of information is only accessible to those with proper authorization.

To get started with Permit, you can create a free account at https://app.permit.io. Once your account is created, you'll have access to Permit's dashboard, where you can set up your authorization policies, manage users and roles, and integrate Permit into your applications. The free tier offers all the features necessary to build our digital healthcare example with ReBAC.

Both Langflow and Permit offer free accounts, so you can build such a system and see how it works for yourself without paying anything.

Implementation Guide

Before we dive into the implementation details, it's important to understand the tool we'll be using - Langflow. Built on top of LangChain, Langflow is an open-source framework that simplifies the creation of complex LLM applications through a visual interface. LangChain provides a robust foundation by offering standardized components for common LLM operations like text splitting, embedding generation, and chain-of-thought prompting. These components can be assembled into powerful pipelines that handle everything from data ingestion to response generation.

What makes Langflow particularly valuable for our use case is its visual builder interface, which allows us to construct these pipelines by connecting components graphically - similar to how you might draw a flowchart. This visual approach makes it easier to understand and modify the flow of data through our application, from initial user input to the final authorized response. Additionally, Langflow's open-source nature means it's both free to use and can be extended with custom components, which is crucial for implementing our authorization checks.

Our Langflow solution leverages two distinct yet interconnected flows to provide secure access to medical information:

1. Ingestion Flow

The ingestion flow is responsible for loading diagnoses into AstraDB along with their respective embeddings. We use MistralAI to generate embeddings for each diagnosis, making it possible to perform semantic searches on the diagnosis data later. The key components involved in this flow are:

  • Create List: This component creates the list of diagnoses to ingest into AstraDB.
  • MistralAI Embeddings: This component generates embeddings for each diagnosis, which are stored in AstraDB.
  • AstraDB: AstraDB serves as the vector store where the diagnoses and their embeddings are kept for later retrieval.

(Screenshot: the ingestion flow in Langflow)
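In plain Python, the equivalent ingestion might look like the following minimal sketch, assuming the langchain-astradb and langchain-mistralai packages and placeholder credentials:

```python
import os
from langchain_astradb import AstraDBVectorStore
from langchain_mistralai import MistralAIEmbeddings

os.environ["MISTRAL_API_KEY"] = "your-mistral-api-key"  # placeholder credential

# MistralAI generates the embedding vectors for each diagnosis.
embeddings = MistralAIEmbeddings(model="mistral-embed")

# AstraDB stores the diagnoses alongside their embeddings.
vector_store = AstraDBVectorStore(
    embedding=embeddings,
    collection_name="diagnoses",
    api_endpoint="https://<db-id>-<region>.apps.astra.datastax.com",  # placeholder
    token="your-astra-db-application-token",                          # placeholder
)

# The "Create List" step: the diagnoses we want to make searchable.
diagnoses = ["Seasonal Migraine", "Flu virus with high fever", "Diabetes"]
vector_store.add_texts(diagnoses)
```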

2. Chat Flow

The chat flow is responsible for interacting with users and serving them the required diagnosis data. The images below should be read from left to right (the right side of the first continues as the left side of the second):

(Screenshot: chat flow, part 1)

(Screenshot: chat flow, part 2)

💡 Note: There is an additional "Pip Install" component that is executed only once to install the permit module. This is needed because we are running Langflow on the DataStax low-code platform; the step is equivalent to executing `pip install permit` locally.

The sequence of operations in the Chat Flow is as follows:

  1. User Input: The user initiates the interaction by typing a query.
  • Example: "Do we have any patients with diabetes diagnosis?"
  2. Retrieve Diagnoses: AstraDB is queried for relevant diagnoses based on the user's input.
  • Example search result (marked with 1 on the flow image above):

(Screenshot: search result)

  3. Filter Data Based on Permissions: Before the response is passed to the next processing component, which builds the context for the LLM answering the initial query, we filter the retrieved diagnoses using a custom PermitFilter component to ensure the user has the right to view each diagnosis.
  • Example filtered results (marked with 2 on the flow image above):

(Screenshot: filtered result)

  4. Generate Response: Once filtered, the permitted diagnoses are used as the context to generate a response to the user prompt using MistralAI.
  • Example prompt, with the context already filtered by the authorization step:

```text
Seasonal Migraine
Flu virus with high fever

---
You are a doctor's assistant and help to retrieve information about patients' diagnoses.
Given the patients' diagnoses above, answer the question as best as possible.
The retrieved diagnoses may belong to multiple patients.

Question: list all the recent diagnoses

Answer:
```

PermitFilter Component

To run the PermitFilter component, which plays a crucial role in our implementation, we need a running instance of Permit's Policy Decision Point (PDP). The PDP is responsible for evaluating policies and making decisions on whether a given action is permitted for a specific user and resource. By enforcing this permission check before the context reaches the language model, we prevent the leakage of sensitive information and ensure the enforcement of access control policies.
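The actual component ships with the repository; as an approximation of its core logic, a filter built on Permit's Python SDK might look like this minimal sketch (the PDP address, user key, resource type, and action names are illustrative and must match your Permit policy):

```python
import asyncio
from permit import Permit

# Points at a locally running PDP container; the token comes from the Permit dashboard.
permit = Permit(pdp="http://localhost:7766", token="permit_key_...")

async def filter_permitted(user_key: str, diagnoses: list[dict]) -> list[dict]:
    """Keep only the diagnoses the user may read, per the ReBAC policy in the PDP."""
    permitted = []
    for diagnosis in diagnoses:
        allowed = await permit.check(
            user_key,
            "read",
            {"type": "diagnosis", "key": diagnosis["id"], "tenant": "default"},
        )
        if allowed:
            permitted.append(diagnosis)
    return permitted

docs = [{"id": "virus", "text": "Flu virus with high fever"},
        {"id": "diabetes", "text": "Diabetes"}]
print(asyncio.run(filter_permitted("bartosz@health.app", docs)))
```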

See It In Action

The complete implementation is available in our GitHub Repository, where you'll find:

  • Custom LangFlow components
  • Permit.io integration code
  • Detailed setup instructions
  • Example queries and responses

To start interacting with our AI assistant, with authorization checks in place, we can simply open the Langflow playground. In the example below, I am authenticated as bartosz@health.app, which grants me access only to the Afternoon-Visit and Evening-Visit, not to the Morning-Visit containing the Diabetes diagnosis. As a result, the LLM has no information about diabetes in its context.

(Screenshot: a chat session in the Langflow playground)

Conclusion

Securing access to sensitive healthcare data while leveraging LLM capabilities is both a priority and a challenge. By combining RAG and fine-grained authorization, we can build AI applications that are both intelligent and secure. The key benefits are:

  • Context-aware responses through RAG
  • Precise access control through ReBAC
  • Natural modeling of organizational relationships
  • Scalable security that adapts to changing relationships

Using tools like LangFlow and Permit.io, healthcare providers can implement relationship-based access control systems that respond dynamically to role and relationship changes, ensuring data is accessible only to authorized individuals. By integrating these solutions, healthcare organizations can effectively harness AI to improve patient care without compromising on security.
