Abigail Afi Gbadago for MongoDB

Posted on Mar 19 • Edited on Apr 28

Build a Secure Risk Assessment API with MongoDB Queryable Encryption in Django

#django #mongodb #encryption

In this tutorial, we will build a Risk Assessment API with encrypted fields using the Django MongoDB Backend. This tutorial targets anyone working with sensitive data that needs to be protected. In this case, we will focus on assessing risk, which is fundamental in sectors such as finance or wealth asset management.

The Problem

Encryption plays a crucial role in mitigating potential threats to investments and in ensuring the regulatory compliance and long-term profitability of organizations. The major problem arises when sensitive data, such as Personally Identifiable Information (PII), financial records, and health records, is stored in plain text in a database, where it is exposed in the event of a breach.

Under the EU’s Digital Operational Resilience Act (DORA), which became effective in January 2025, financial institutions operating in European markets are explicitly mandated to implement robust Information and Communications Technology (ICT) risk management frameworks. These include strict requirements for encrypting data at rest, in transit, and in use, and similar requirements are emerging globally.

Why MongoDB Queryable Encryption

MongoDB's Queryable Encryption solves this by encrypting data on the client side before it reaches the server, meaning the database can process queries on encrypted fields without ever seeing the plaintext. With Django MongoDB Backend's Queryable Encryption support, Django developers can protect their data. The fields are encrypted on write, decrypted on read, and the MongoDB server never has access to the raw data, even if compromised.

Queryable Encryption has been generally available since MongoDB 7.0, which supports equality queries on encrypted fields. MongoDB 8.0 adds support for range queries on encrypted fields. More advanced text-style queries (prefix, suffix, and substring) are being introduced in preview releases prior to general availability and are expected to become fully supported in a future release. To see the full range of supported operations, check here.

Design the Data Model

When deciding the data model for the risk assessment, let’s take a minute to think about the fields in our model.

Things we will encrypt: names, titles, descriptions that contain PII, and other sensitive text.
Things we won’t encrypt: scores, dates, foreign keys, and anything you need to sort or index on.

The relationship is visualized in the ERD below:

Overview of the Process

When a user makes a write request, for example POST /api/categories/ with {"name": "Financial"} in the request body, Django receives the plaintext "Financial". The EncryptedCharField then uses Queryable Encryption to encrypt the value before it is stored as binary, and to decrypt it again when it is read back.

Prerequisites

Platform used: macOS Tahoe 26.3
Python 3.13+
MongoDB Atlas account (free tier works)
Download mongo_crypt_shared library
- You can choose from MongoDB Enterprise Downloads or choose the current release:
- Select your OS; macOS ARM64 / Linux / Windows here
- Extract and note the path to mongo_crypt_v1.dylib (macOS) or mongo_crypt_v1.so (Linux)
- Reference the path in crypt_shared_lib_path in settings.py

Next, create a requirements.txt file with the following content:

Django>=6.0
djangorestframework==3.16.1

# For encryption (depends on the corresponding Django version, >=6.0)
django-mongodb-backend[encryption]

# For the API documentation (Swagger)
drf-spectacular

# For environment variables
django-environ

Then run the following command in your terminal:
pip install -r requirements.txt

Project Setup

Create your Django project and app:

mkdir querysafe && cd querysafe
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
django-admin startproject querysafe . --template https://github.com/mongodb-labs/django-mongodb-project/archive/refs/heads/6.0.x.zip
python manage.py startapp riskapp

Add apps to settings.py

Store your MongoDB credentials ([MONGODB_USERNAME], [PASSWORD]) in a .env file rather than hard-coding them in settings.py.

INSTALLED_APPS = [
    ...
    "rest_framework",
    "drf_spectacular",
    "riskapp",
]
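To keep the credentials mentioned above out of settings.py, one option is to read them from the environment and assemble the connection string at startup. The sketch below uses only the standard library. The variable names MONGODB_USERNAME and PASSWORD follow the placeholders above, while MONGODB_HOST and both helper functions are hypothetical names chosen for illustration.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(username: str, password: str, host: str) -> str:
    """Assemble a mongodb+srv URI, percent-encoding the credentials."""
    # Characters such as "@", ":" or "/" in a username or password must be
    # percent-encoded, otherwise the URI cannot be parsed correctly.
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}/"


def mongo_uri_from_env(environ=os.environ) -> str:
    """Read the (assumed) variable names and build the URI."""
    return build_mongo_uri(
        environ["MONGODB_USERNAME"],
        environ["PASSWORD"],
        environ["MONGODB_HOST"],  # hypothetical, e.g. "cluster0.example.mongodb.net"
    )
```

The resulting string could then be used as the "HOST" value of a database entry. A missing variable raises KeyError at startup, which is usually preferable to connecting with an empty password.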

Configure encrypted database

For the (encrypted) database, let's set up a few things first.

Download mongo_crypt_shared

Download from MongoDB Enterprise Downloads or choose the current release here
Extract and note the path to the library file
Reference the path in crypt_shared_lib_path in settings.py (it usually looks like crypt_shared_lib_path: /Users/path/mongo_crypt_shared_v1-macos-arm64-enterprise-8.2.5/lib/mongo_crypt_v1.dylib)
For more information, check here
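Because crypt_shared_lib_required=True makes a wrong path fatal at connection time, it can help to validate the path before Django starts. The following is a minimal sketch using only the standard library; check_crypt_shared_path is a hypothetical helper, and the accepted file names are just the two mentioned in this tutorial.

```python
from pathlib import Path

# File names mentioned above for macOS and Linux; extend for other platforms.
EXPECTED_NAMES = {"mongo_crypt_v1.dylib", "mongo_crypt_v1.so"}


def check_crypt_shared_path(path: str) -> Path:
    """Return the resolved library path, or raise a descriptive error."""
    lib = Path(path).expanduser()
    if lib.name not in EXPECTED_NAMES:
        raise ValueError(f"Unexpected library file name: {lib.name}")
    if not lib.is_file():
        raise FileNotFoundError(f"No library found at {lib}")
    return lib.resolve()
```

Calling it once with the extracted path gives a clearer error message than a failed database connection would.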

Now add the encrypted database to your settings.py file under the default database:

# settings.py
from pymongo.encryption_options import AutoEncryptionOpts

DATABASES = {
    # (default)
    "default": {
        "ENGINE": "django_mongodb_backend",
        # Replace MONGO_URI with your connection string, e.g.
        # "mongodb+srv://[username]:[password]@cluster0.hpljptt.mongodb.net/?appName=devrel-tutorial-python-qe-medium"
        "HOST": "MONGO_URI",
        "NAME": "querysafe",
    },
    # (encrypted)
    "encrypted": {
        "ENGINE": "django_mongodb_backend",
        "HOST": "mongodb+srv://[username]:[password]@cluster0.hpljptt.mongodb.net/?appName=devrel-tutorial-python-qe-medium",
        "NAME": "encrypted_db",
        "OPTIONS": {
            "auto_encryption_opts": AutoEncryptionOpts(
                # Collection that stores the data encryption keys
                key_vault_namespace="encrypted_db.__keyVault",
                kms_providers={
                    "local": {
                        # Generated by os.urandom(96)
                        "key": (
                            b'-\xc3\x0c\xe3\x93\xc3\x8b\xc0\xf8\x12\xc5#b'
                            b'\x19\xf3\xbc\xccR\xc8\xedI\xda\\ \xfb\x9cB'
                            b'\x7f\xab5\xe7\xb5\xc9x\xb8\xd4d\xba\xdc\x9c'
                            b'\x9a\xdb9J]\xe6\xce\x104p\x079q.=\xeb\x9dK*'
                            b'\x97\xea\xf8\x1e\xc3\xd49K\x18\x81\xc3\x1a"'
                            b'\xdc\x00U\xc4u"X\xe7xy\xa5\xb2\x0e\xbc\xd6+-'
                            b'\x80\x03\xef\xc2\xc4\x9bU'
                        )
                    },
                },
                crypt_shared_lib_path="[PATH/TO/mongo_crypt_v1.dylib]",  # for macOS
                crypt_shared_lib_required=True,
            )
        },
    },
}
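The hard-coded key above is convenient for a demo, but anyone who can read settings.py can decrypt the data. One alternative, sketched here with the standard library only, is to generate the 96-byte key once and load it from a file kept outside version control. The helper name load_or_create_local_key and the idea of a dedicated key file are assumptions for illustration, not part of the original setup.

```python
import os
from pathlib import Path

KEY_SIZE = 96  # the local KMS provider expects a 96-byte master key


def load_or_create_local_key(path: str) -> bytes:
    """Load the local master key from `path`, creating it on first use."""
    key_file = Path(path)
    if key_file.exists():
        key = key_file.read_bytes()
        if len(key) != KEY_SIZE:
            raise ValueError(f"Expected {KEY_SIZE} bytes, found {len(key)}")
        return key
    key = os.urandom(KEY_SIZE)
    key_file.write_bytes(key)
    key_file.chmod(0o600)  # restrict access to the current user where supported
    return key
```

The returned bytes could be passed as the "key" value of the "local" provider. Losing the file makes existing encrypted data unreadable, so it needs to be backed up like any other secret.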

Now, to visualize the encryption process for our risk data, we will first route operations to the encrypted database by creating a router, define our models, perform migrations, seed the db and then take a look at the encrypted fields in the db.

Create a router

# riskapp/routers.py

class EncryptedRouter:
    def allow_migrate(self, db, app_label, model_name=None, **hints):

        if app_label == "riskapp":
            return db == "encrypted"

        if db == "encrypted":
            return False
        return None

    def db_for_read(self, model, **hints):

        if model._meta.app_label == "riskapp":
            return "encrypted"
        return None

    db_for_write = db_for_read
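Django only consults a router once it is listed in the DATABASE_ROUTERS setting. Assuming the class lives in riskapp/routers.py as shown above, the settings.py entry could look like the sketch below. If your settings.py already defines DATABASE_ROUTERS (the project template may ship with one), add the new entry to the existing list rather than replacing it.

```python
# settings.py (sketch)
DATABASE_ROUTERS = [
    # ... keep any routers your project template already lists here ...
    "riskapp.routers.EncryptedRouter",  # dotted path assumes riskapp/routers.py
]
```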

Define models

# riskapp/models.py

from django.db import models
from django_mongodb_backend.models import EmbeddedModel
from django_mongodb_backend.fields import (
    EncryptedCharField,
    EmbeddedModelField,
)

class RiskCategory(EmbeddedModel):
    name = EncryptedCharField(max_length=100)
    description = EncryptedCharField(max_length=500, blank=True)

    def __str__(self):
        return str(self.name)

class RiskOwner(EmbeddedModel):
    username = EncryptedCharField(max_length=150)
    email = EncryptedCharField(max_length=254, blank=True)
    role = EncryptedCharField(max_length=100, blank=True)

    def __str__(self):
        return f"{self.username} ({self.role})"


class Risk(EmbeddedModel):
    title = EncryptedCharField(max_length=200)
    description = EncryptedCharField(max_length=500, blank=True, default="")
    category = EmbeddedModelField(RiskCategory)

    def __str__(self):
        return str(self.title)


class RiskAssessment(models.Model):
    RISK_LEVEL_CHOICES = [
        ("Low", "Low"),
        ("Medium", "Medium"),
        ("High", "High"),
    ]

    risk = EmbeddedModelField(Risk)
    owner = EmbeddedModelField(RiskOwner, blank=True, null=True)
    likelihood = models.CharField(max_length=10, choices=RISK_LEVEL_CHOICES)
    impact = models.CharField(max_length=10, choices=RISK_LEVEL_CHOICES)
    score = models.PositiveSmallIntegerField()
    assessed_on = models.DateField(auto_now_add=True)

    class Meta:
        db_table = "risk_assessments"
        ordering = ["-assessed_on"]
        indexes = [
            models.Index(fields=["likelihood"]),
            models.Index(fields=["impact"]),
            models.Index(fields=["score"]),
        ]

    def __str__(self):
        return f"{self.risk.title} ({self.likelihood}/{self.impact})"

Run migrations

python manage.py makemigrations riskapp
python manage.py migrate --database encrypted

Create serializers for request and response

#serializers.py
from rest_framework import serializers

RISK_LEVEL_CHOICES = ["Low", "Medium", "High"]


class RiskCategorySerializer(serializers.Serializer):

    name = serializers.CharField(max_length=100)
    description = serializers.CharField(
        max_length=500, required=False, default=""
    )


class RiskSerializer(serializers.Serializer):

    title = serializers.CharField(max_length=200)
    description = serializers.CharField(
        max_length=500, required=False, default=""
    )
    category = RiskCategorySerializer()


class RiskOwnerSerializer(serializers.Serializer):

    username = serializers.CharField(max_length=150)
    email = serializers.CharField(max_length=254, required=False, default="")
    role = serializers.CharField(max_length=100, required=False, default="")


class AssessmentSerializer(serializers.Serializer):

    risk = RiskSerializer()
    owner = RiskOwnerSerializer(required=False)
    likelihood = serializers.ChoiceField(choices=RISK_LEVEL_CHOICES)
    impact = serializers.ChoiceField(choices=RISK_LEVEL_CHOICES)
    score = serializers.IntegerField(min_value=1, max_value=10)


class AssessmentResponseSerializer(serializers.Serializer):

    id = serializers.CharField(source='pk', read_only=True)
    risk = RiskSerializer()
    owner = RiskOwnerSerializer()
    likelihood = serializers.CharField()
    impact = serializers.CharField()
    score = serializers.IntegerField()
    assessed_on = serializers.DateField()

Create class-based views with APIView

We will use @extend_schema to provide detailed, accurate documentation for each endpoint.

#views.py

from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView
from drf_spectacular.utils import extend_schema

from riskapp.models import (
    RiskAssessment,
    Risk,
    RiskCategory,
    RiskOwner,
)
from riskapp.serializers import (
    AssessmentSerializer,
    AssessmentResponseSerializer,
)

DB = 'encrypted'  # database alias (the key in settings.DATABASES), not the database NAME


class AssessmentView(APIView):

    @extend_schema(
        responses=AssessmentResponseSerializer(many=True),
        summary="List all assessments",
        tags=["Assessments"],
    )
    def get(self, request):
        assessments = RiskAssessment.objects.using(DB).all()
        serializer = AssessmentResponseSerializer(assessments, many=True)
        return Response(serializer.data)

    @extend_schema(
        request=AssessmentSerializer,
        responses=AssessmentResponseSerializer,
        summary="Create an assessment",
        tags=["Assessments"],
    )
    def post(self, request):
        serializer = AssessmentSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        data = serializer.validated_data

        risk_data = data['risk']
        category_data = risk_data['category']

        assessment = RiskAssessment(
            risk=Risk(
                title=risk_data['title'],
                description=risk_data.get('description', ''),
                category=RiskCategory(
                    name=category_data['name'],
                    description=category_data.get('description', ''),
                ),
            ),
            likelihood=data['likelihood'],
            impact=data['impact'],
            score=data['score'],
        )

        if data.get('owner'):
            owner_data = data['owner']
            assessment.owner = RiskOwner(
                username=owner_data['username'],
                email=owner_data.get('email', ''),
                role=owner_data.get('role', ''),
            )

        assessment.save(using=DB)

        return Response(
            AssessmentResponseSerializer(assessment).data,
            status=status.HTTP_201_CREATED,
        )

Add urls.py at the project and app level.

#querysafe/urls.py

from django.contrib import admin
from django.urls import path, include
from drf_spectacular.views import SpectacularAPIView, SpectacularSwaggerView

urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/', include('riskapp.urls')),
    path('api/schema/', SpectacularAPIView.as_view(), name='schema'), #for swagger-ui
    path('api/docs/', SpectacularSwaggerView.as_view(url_name='schema'), name='swagger-ui'), #for swagger-ui
]

#riskapp/urls.py

from django.urls import path
from riskapp.views import AssessmentView

urlpatterns = [
    path('assessments/', AssessmentView.as_view(), name='assessments'),
]
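With the route in place, a request body for the create endpoint has to match the nested shape that AssessmentSerializer expects. The sketch below builds such a request with the standard library only. The values are borrowed from the tutorial's mock data, build_create_request is a hypothetical helper, and the URL assumes the development server's default port.

```python
import json
from urllib.request import Request

API_URL = "http://localhost:8000/api/assessments/"  # assumes the default dev server port

payload = {
    "risk": {
        "title": "Credit Card Leak",
        "description": "Payment card data queried without masking",
        "category": {
            "name": "Financial",
            "description": "Financial data and transactions",
        },
    },
    # "owner" is optional in AssessmentSerializer
    "owner": {"username": "assessor", "email": "assessor@querysafe.com", "role": "Analyst"},
    "likelihood": "Medium",
    "impact": "High",
    "score": 8,  # must be between 1 and 10
}


def build_create_request(url: str, body: dict) -> Request:
    """Build (but do not send) a JSON POST request."""
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(API_URL, payload)
# To send it while the server is running:
#     from urllib.request import urlopen
#     with urlopen(request) as response:
#         print(json.load(response))
```

The client only ever handles plaintext; encryption happens later, when the view saves the model to the encrypted database.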

Next, we will set up the structure for a custom management command:

Create a folder called management inside riskapp and a folder called commands inside it, which is where we will add a Python script to seed the data. You can do this via the GUI or by running mkdir -p riskapp/management/commands.
Then create an empty __init__.py file in each directory to make them Python packages, either by running touch riskapp/management/__init__.py and touch riskapp/management/commands/__init__.py or via the GUI.

Seed the data with seed_data.py below, which contains mock data.

NB: Because the encrypted fields in this tutorial can't be queried, the seed script builds each assessment in memory and calls save() instead of looking up existing records, and it offers a --target flush option to stay rerunnable.

# riskapp/management/commands/seed_data.py

from django.core.management.base import BaseCommand
from django.db import transaction

from riskapp.models import (
    RiskAssessment,
    Risk,
    RiskCategory,
    RiskOwner,
)

ASSESSMENTS = [
    {
        "risk_title": "NoSQL Injection",
        "risk_description": "Potential NoSQL injection via unsanitised operators",
        "category": "Security",
        "cat_desc": "Security and authentication data",
        "likelihood": "High",
        "impact": "High",
        "score": 9,
    },
    {
        "risk_title": "Data Exposure",
        "risk_description": "Sensitive data returned without field projection",
        "category": "Personal",
        "cat_desc": "Personal identifiable information (PII)",
        "likelihood": "Medium",
        "impact": "High",
        "score": 7,
    },
    {
        "risk_title": "Privilege Escalation",
        "risk_description": "Attempt to modify roles or permissions",
        "category": "Security",
        "cat_desc": "Security and authentication data",
        "likelihood": "Low",
        "impact": "High",
        "score": 6,
    },
    {
        "risk_title": "GDPR Violation",
        "risk_description": "Query accesses EU citizen data without consent check",
        "category": "Compliance",
        "cat_desc": "Regulatory and compliance data",
        "likelihood": "Medium",
        "impact": "Medium",
        "score": 5,
    },
    {
        "risk_title": "PHI Exposure",
        "risk_description": "Protected health information returned in query",
        "category": "Healthcare",
        "cat_desc": "Medical and health records (PHI)",
        "likelihood": "High",
        "impact": "High",
        "score": 9,
    },
    {
        "risk_title": "Credit Card Leak",
        "risk_description": "Payment card data queried without masking",
        "category": "Financial",
        "cat_desc": "Financial data and transactions",
        "likelihood": "Medium",
        "impact": "High",
        "score": 8,
    },
    {
        "risk_title": "Mass Data Export",
        "risk_description": "Query fetches excessive number of documents",
        "category": "Compliance",
        "cat_desc": "Regulatory and compliance data",
        "likelihood": "Low",
        "impact": "Medium",
        "score": 4,
    },
    {
        "risk_title": "Unindexed Scan",
        "risk_description": "Collection scan on sensitive collection",
        "category": "Financial",
        "cat_desc": "Financial data and transactions",
        "likelihood": "High",
        "impact": "Low",
        "score": 3,
    },
]

class Command(BaseCommand):
    """
    Seed the database with sample risk assessment data.
    Use --target flush to clear data before reseeding.
    """

    help = "Seed the database with sample risk assessment data"

    def add_arguments(self, parser):
        parser.add_argument(
            '--target',
            default='all',
            choices=['all', 'default', 'encrypted', 'flush'],
            help="'flush' to clear data, 'all' to seed both, or specify a database",
        )

    def handle(self, *args, **options):
        target = options['target']

        if target == 'flush':
            self._flush('default')
            self._flush('encrypted')
            return

        if target == 'all':
            databases = ['default', 'encrypted']
        else:
            databases = [target]

        try:
            for db in databases:
                self.db = db
                self.stdout.write(f"\nSeeding into: {self.db}")
                with transaction.atomic(using=self.db):
                    self._seed_assessments()
            self.stdout.write(self.style.SUCCESS("\nAll data seeded successfully!"))
        except Exception as e:
            self.stdout.write(self.style.ERROR(f"\nSeeding failed: {e}"))

    def _flush(self, db):
        RiskAssessment.objects.using(db).all().delete()
        self.stdout.write(self.style.WARNING(f"Flushed data from: {db}"))

    def _seed_assessments(self):
        self.stdout.write("\n--- Seeding Risk Assessments ---")

        owner = RiskOwner(
            username="assessor",
            email="assessor@querysafe.com",
            role="Analyst",
        )

        for item in ASSESSMENTS:
            assessment = RiskAssessment(
                risk=Risk(
                    title=item["risk_title"],
                    description=item["risk_description"],
                    category=RiskCategory(
                        name=item["category"],
                        description=item["cat_desc"],
                    ),
                ),
                owner=owner,
                likelihood=item["likelihood"],
                impact=item["impact"],
                score=item["score"],
            )
            assessment.save(using=self.db)
            self.stdout.write(
                f"  Created: {item['risk_title']} ({item['likelihood']}/{item['impact']})"
            )

Run the script file in the terminal
python manage.py seed_data

Swagger Documentation

Next, we add Swagger UI to create a presentation layer for our API.

Install drf-spectacular (skip this step if you already installed it from requirements.txt):
pip install drf-spectacular

Next, add it to your INSTALLED_APPS and configure it in settings.py:

INSTALLED_APPS = [
    'querysafe.apps.MongoAdminConfig',
    'querysafe.apps.MongoAuthConfig',
    'querysafe.apps.MongoContentTypesConfig',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'django_mongodb_backend',
    'rest_framework',
    'drf_spectacular',
    'riskapp',
]

REST_FRAMEWORK = {
    "DEFAULT_SCHEMA_CLASS": "drf_spectacular.openapi.AutoSchema",
}

SPECTACULAR_SETTINGS = {
    "TITLE": "QuerySafe API",
    "DESCRIPTION": "API for Risk Assessment using Queryable Encryption",
    "VERSION": "1.0.0",
    "SERVE_INCLUDE_SCHEMA": False,
}

Next, run the server with python manage.py runserver and open the Swagger UI at http://localhost:8000/api/docs/

Results

Encrypted Data in MongoDB Atlas

The following screenshots show how sensitive data appears as encrypted binary data when viewed directly in MongoDB Atlas, while the Swagger interface displays the decrypted plaintext for authorized users.

Swagger UI (Auto-Decrypted)

The Swagger UI at http://127.0.0.1:8000/api/docs/ automatically decrypts and displays the data in plaintext for API consumers.

Terminal Query

When querying the database directly from the terminal using find_one(), the data appears as encrypted binary, demonstrating that the database server never has access to the plaintext.
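To see at a glance which fields came back as ciphertext, you can walk the raw document and list every binary value. The helper below is a sketch that relies on one fact about PyMongo: BSON binary values are returned as bytes (or bson.Binary, a bytes subclass). The function name and the sample document are illustrative stand-ins, and the byte strings are placeholders rather than real ciphertext.

```python
def encrypted_field_paths(document: dict, prefix: str = "") -> list[str]:
    """Return the dotted paths of all bytes-valued fields in a nested document."""
    paths = []
    for key, value in document.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Embedded models are stored as nested documents, so recurse.
            paths.extend(encrypted_field_paths(value, prefix=f"{path}."))
        elif isinstance(value, bytes):
            paths.append(path)
    return paths


# Stand-in for a document fetched by a client without auto-decryption.
sample = {
    "risk": {"title": b"\x10...", "category": {"name": b"\x10..."}},
    "likelihood": "High",
    "score": 9,
}
print(encrypted_field_paths(sample))  # ['risk.title', 'risk.category.name']
```

In practice you would pass in the document returned by find_one() on a connection that has no auto_encryption_opts configured.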

Limitations

With the encrypted model fields as declared in this tutorial, you cannot use get_or_create() or filter() on them: each encryption produces a different ciphertext, so equality lookups never match.
Performance cost of client-side encryption and decryption: every read or write of an encrypted field is encrypted or decrypted by the client. After fetching the data, your application server must decrypt each encrypted field of every record, so listing 1000 records requires at least 1000 decryption operations, which adds CPU overhead and slows down response times.
Why you can't encrypt everything:
- Dates: on an encrypted field you can't auto-populate timestamps (auto_now_add), order results (order_by('-assessed_on')), or filter by date ranges (filter(assessed_on__gte=last_week)).
- Scores/numbers: you can't sort or aggregate encrypted values. The database can't perform ORDER BY, SUM(), AVG(), MIN(), MAX(), or range queries (filter(score__gte=8)) on fields encrypted as configured here, because it only sees ciphertext.
- Indexed fields: CharField with choices, like likelihood and impact, need to remain queryable, so they can't be encrypted.
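The first limitation is easier to see with a toy example. The code below is deliberately not the algorithm Queryable Encryption uses and is not secure. It is a stand-in built from hashlib and secrets that shares one property with randomized encryption: a fresh random value is mixed into every call, so the same plaintext never produces the same bytes twice. All function names are hypothetical.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce (toy only)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return stream[:length]


def toy_encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh randomness on every call
    data = plaintext.encode("utf-8")
    body = bytes(a ^ b for a, b in zip(data, _keystream(key, nonce, len(data))))
    return nonce + body


def toy_decrypt(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:16], blob[16:]
    data = bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))
    return data.decode("utf-8")


key = secrets.token_bytes(32)
first = toy_encrypt(key, "Financial")
second = toy_encrypt(key, "Financial")
assert first != second  # same plaintext, different stored bytes
assert toy_decrypt(key, first) == toy_decrypt(key, second) == "Financial"
```

A byte-for-byte comparison on the server therefore cannot tell that the two stored values are the same word, which is why lookups such as get_or_create() have nothing to match against.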

Key Takeaways

Only encrypt fields that hold sensitive data and that you never need to search, sort, or filter by at the database level.
In the event of a database breach, the binary data won't be of any use to attackers, which keeps your data confidential (MongoDB never sees your plaintext data).

If you liked this tutorial, kindly comment and give me an emoji.

Thank you.
