Abigail Afi Gbadago for MongoDB

Queryable Encryption with Django MongoDB Backend

In this tutorial, we will build a Risk Assessment API with encrypted fields using the Django MongoDB Backend. This tutorial targets anyone working with sensitive data that needs to be protected. In this case, we will focus on assessing risk, which is fundamental in sectors such as finance or wealth asset management.

The Problem

Encryption plays a crucial role in mitigating potential threats to investments and in ensuring the regulatory compliance and long-term profitability of organizations. The major problem arises when sensitive data, such as Personally Identifiable Information (PII) and financial and health records, is stored in plain text in databases, where it is exposed in the event of a database breach.

Under the EU’s Digital Operational Resilience Act (DORA), which has applied since January 2025, financial institutions operating in European markets are explicitly mandated to implement robust Information and Communications Technology (ICT) risk management frameworks. These include strict requirements for encrypting data at rest, in transit, and in use, and similar requirements are emerging globally.

Why MongoDB Queryable Encryption

MongoDB's Queryable Encryption solves this by encrypting data on the client side before it reaches the server, meaning the database can process queries on encrypted fields without ever seeing the plaintext. With Django MongoDB Backend's Queryable Encryption support, Django developers can protect their data. The fields are encrypted on write, decrypted on read, and the MongoDB server never has access to the raw data, even if compromised.

Queryable Encryption has been generally available since MongoDB 7.0, where it supports equality queries on encrypted fields. MongoDB 8.0 adds support for range queries on encrypted fields. More advanced text-style queries (prefix, suffix, and substring) are being introduced in preview releases prior to general availability and are expected to become fully supported in a future release. To see the full range of supported operations, check here.

Design the Data Model

When deciding the data model for the risk assessment, let’s take a minute to think about the fields in our model.

  • Things we will encrypt: Names, titles, descriptions that contain PII, and sensitive text.

  • Things we won’t encrypt: Scores, dates, foreign keys, and anything you need to sort or index on.

The relationship is visualized in the ERD below:

ERD - Embedded Document

Overview of the Process

When a user makes a request, for example a write request (POST /api/categories/) with {"name": "Financial"} in the request body, Django receives the plaintext "Financial". The EncryptedCharField then uses Queryable Encryption to encrypt the value before it is stored as binary, and decrypts it again when it is read back.

Prerequisites

  • Platform used: macOS Tahoe 26.3
  • Python 3.13+
  • MongoDB Atlas account (free tier works)
  • Download mongo_crypt_shared library
    • You can choose from MongoDB Enterprise Downloads or choose the current release:
    • Select your OS; macOS ARM64 / Linux / Windows here
    • Extract and note the path to mongo_crypt_v1.dylib (macOS) or mongo_crypt_v1.so (Linux)
    • Reference the path in crypt_shared_lib_path in settings.py

Next, create a requirements.txt file with the following content:

Django>=6.0
djangorestframework==3.16.1

# For encryption (depends on the corresponding Django version, >=6.0)
django-mongodb-backend[encryption]

# For the API documentation (Swagger)
drf-spectacular

# For environment variables
django-environ

Then run the following command in your terminal:
pip install -r requirements.txt

Project Setup

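Create your Django project (named querysafe in this tutorial) and an app named riskapp, for example with django-admin startproject and python manage.py startapp.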

Add apps to settings.py

It is best to keep credentials such as [MONGODB_USERNAME] and [PASSWORD] in a .env file rather than hard-coding them in settings.py.

INSTALLED_APPS = [
    ...
    "rest_framework",
    "drf_spectacular",
    "riskapp",
]
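
As a minimal sketch of the .env advice above, the connection string can be assembled from environment variables instead of being hard-coded. The variable names MONGODB_USERNAME, MONGODB_PASSWORD, and MONGODB_HOST and the helper build_mongo_uri are assumptions for illustration; django-environ (listed in requirements.txt) can load the .env file into the environment first.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(env=os.environ):
    """Assemble a mongodb+srv URI from environment variables (hypothetical names)."""
    # Percent-encode credentials so characters such as "@" or ":" stay valid in the URI.
    username = quote_plus(env["MONGODB_USERNAME"])
    password = quote_plus(env["MONGODB_PASSWORD"])
    host = env["MONGODB_HOST"]  # e.g. the cluster hostname shown in Atlas
    return f"mongodb+srv://{username}:{password}@{host}/"
```

The result could then be used as the "HOST" value of a database entry in settings.py.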

Configure encrypted database

For the (encrypted) database, let's set up a few things first.

Download mongo_crypt_shared

  • Download from MongoDB Enterprise Downloads or choose the current release here
  • Extract and note the path to the library file
  • Reference the path in crypt_shared_lib_path in settings.py (usually looks like crypt_shared_lib_path: /Users/path/mongo_crypt_shared_v1-macos-arm64-enterprise-8.2.5/lib/mongo_crypt_v1.dylib)
  • For more information, check here

Now add the encrypted database to your settings.py file under the default database:

# settings.py
from pymongo.encryption_options import AutoEncryptionOpts

DATABASES = {
    # (default)
    "default": {
        "ENGINE": "django_mongodb_backend",
        # Placeholder: replace MONGO_URI with your connection string, e.g.
        # "mongodb+srv://[username]:[password]@cluster0.hpljptt.mongodb.net/?appName=devrel-tutorial-python-qe-medium"
        "HOST": "MONGO_URI",
        "NAME": "querysafe",
    },
    # (encrypted)
    "encrypted": {
        "ENGINE": "django_mongodb_backend",
        "HOST": "mongodb+srv://[username]:[password]@cluster0.hpljptt.mongodb.net/?appName=devrel-tutorial-python-qe-medium",
        "NAME": "encrypted_db",
        "OPTIONS": {
            "auto_encryption_opts": AutoEncryptionOpts(
                key_vault_namespace="encrypted_db.__keyVault",
                kms_providers={
                    "local": {
                        # Generated by os.urandom(96)
                        "key": (
                            b'-\xc3\x0c\xe3\x93\xc3\x8b\xc0\xf8\x12\xc5#b'
                            b'\x19\xf3\xbc\xccR\xc8\xedI\xda\\ \xfb\x9cB'
                            b'\x7f\xab5\xe7\xb5\xc9x\xb8\xd4d\xba\xdc\x9c'
                            b'\x9a\xdb9J]\xe6\xce\x104p\x079q.=\xeb\x9dK*'
                            b'\x97\xea\xf8\x1e\xc3\xd49K\x18\x81\xc3\x1a"'
                            b'\xdc\x00U\xc4u"X\xe7xy\xa5\xb2\x0e\xbc\xd6+-'
                            b'\x80\x03\xef\xc2\xc4\x9bU'
                        )
                    },
                },
                crypt_shared_lib_path="[PATH/TO/mongo_crypt_v1.dylib]",  # for macOS
                crypt_shared_lib_required=True,
            )
        },
    },
}
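
The "Generated by os.urandom(96)" comment above can be made concrete. The sketch below generates a 96-byte local master key once and reads it back from an environment variable, so the key does not have to live in settings.py. The variable name QE_LOCAL_MASTER_KEY and both helper names are assumptions for illustration; a local key is only suitable for development.

```python
import base64
import os


def generate_local_master_key() -> str:
    """Return a fresh 96-byte key, base64-encoded so it fits in a .env file."""
    return base64.b64encode(os.urandom(96)).decode("ascii")


def load_local_master_key(env=os.environ) -> bytes:
    """Decode the key from the (hypothetical) QE_LOCAL_MASTER_KEY variable."""
    key = base64.b64decode(env["QE_LOCAL_MASTER_KEY"])
    if len(key) != 96:
        raise ValueError("local master key must be exactly 96 bytes")
    return key
```

The returned bytes could then be passed as kms_providers={"local": {"key": load_local_master_key()}}.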

Now, to visualize the encryption process for our risk data, we will first route operations to the encrypted database by creating a router, define our models, perform migrations, seed the db and then take a look at the encrypted fields in the db.

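Create a router

Django only consults a router that is listed in DATABASE_ROUTERS, so add "riskapp.routers.EncryptedRouter" to that list in settings.py once the class below exists.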

# riskapp/routers.py

class EncryptedRouter:
    def allow_migrate(self, db, app_label, model_name=None, **hints):

        if app_label == "riskapp":
            return db == "encrypted"

        if db == "encrypted":
            return False
        return None

    def db_for_read(self, model, **hints):

        if model._meta.app_label == "riskapp":
            return "encrypted"
        return None

    db_for_write = db_for_read
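
To see how these rules play out, the sketch below repeats the router's allow_migrate logic in a standalone class and calls it the way Django's migrate command would. The class is copied only so the snippet runs without a Django project; the "auth" app label is just an example of an app other than riskapp.

```python
class EncryptedRouter:
    """Standalone copy of the allow_migrate rules from riskapp/routers.py."""

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # riskapp models may only be migrated into the "encrypted" database.
        if app_label == "riskapp":
            return db == "encrypted"
        # No other app is allowed into the "encrypted" database.
        if db == "encrypted":
            return False
        # Otherwise the router has no opinion.
        return None


router = EncryptedRouter()
decisions = {
    ("encrypted", "riskapp"): router.allow_migrate("encrypted", "riskapp"),
    ("default", "riskapp"): router.allow_migrate("default", "riskapp"),
    ("encrypted", "auth"): router.allow_migrate("encrypted", "auth"),
    ("default", "auth"): router.allow_migrate("default", "auth"),
}
for (db, app_label), allowed in decisions.items():
    print(f"{app_label} -> {db}: {allowed}")
```

A return value of None means the router has no opinion, so Django falls through to the next router or to its default behaviour.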

Define models

# riskapp/models.py
from django.db import models
from django_mongodb_backend.models import EmbeddedModel
from django_mongodb_backend.fields import (
    EncryptedCharField,
    EmbeddedModelField,
)

class RiskCategory(EmbeddedModel):
    name = EncryptedCharField(max_length=100)
    description = EncryptedCharField(max_length=500, blank=True)

    def __str__(self):
        return str(self.name)

class RiskOwner(EmbeddedModel):
    username = EncryptedCharField(max_length=150)
    email = EncryptedCharField(max_length=254, blank=True)
    role = EncryptedCharField(max_length=100, blank=True)

    def __str__(self):
        return f"{self.username} ({self.role})"


class Risk(EmbeddedModel):
    title = EncryptedCharField(max_length=200)
    description = EncryptedCharField(max_length=500, blank=True, default="")
    category = EmbeddedModelField(RiskCategory)

    def __str__(self):
        return str(self.title)


class RiskAssessment(models.Model):
    RISK_LEVEL_CHOICES = [
        ("Low", "Low"),
        ("Medium", "Medium"),
        ("High", "High"),
    ]

    risk = EmbeddedModelField(Risk)
    owner = EmbeddedModelField(RiskOwner, blank=True, null=True)
    likelihood = models.CharField(max_length=10, choices=RISK_LEVEL_CHOICES)
    impact = models.CharField(max_length=10, choices=RISK_LEVEL_CHOICES)
    score = models.PositiveSmallIntegerField()
    assessed_on = models.DateField(auto_now_add=True)

    class Meta:
        db_table = "risk_assessments"
        ordering = ["-assessed_on"]
        indexes = [
            models.Index(fields=["likelihood"]),
            models.Index(fields=["impact"]),
            models.Index(fields=["score"]),
        ]

    def __str__(self):
        return f"{self.risk.title} ({self.likelihood}/{self.impact})"


Run migrations

python manage.py makemigrations riskapp
python manage.py migrate --database encrypted

Create serializers for request and response

#serializers.py
from rest_framework import serializers

RISK_LEVEL_CHOICES = ["Low", "Medium", "High"]


class RiskCategorySerializer(serializers.Serializer):

    name = serializers.CharField(max_length=100)
    description = serializers.CharField(
        max_length=500, required=False, default=""
    )


class RiskSerializer(serializers.Serializer):

    title = serializers.CharField(max_length=200)
    description = serializers.CharField(
        max_length=500, required=False, default=""
    )
    category = RiskCategorySerializer()


class RiskOwnerSerializer(serializers.Serializer):

    username = serializers.CharField(max_length=150)
    email = serializers.CharField(max_length=254, required=False, default="")
    role = serializers.CharField(max_length=100, required=False, default="")


class AssessmentSerializer(serializers.Serializer):

    risk = RiskSerializer()
    owner = RiskOwnerSerializer(required=False)
    likelihood = serializers.ChoiceField(choices=RISK_LEVEL_CHOICES)
    impact = serializers.ChoiceField(choices=RISK_LEVEL_CHOICES)
    score = serializers.IntegerField(min_value=1, max_value=10)


class AssessmentResponseSerializer(serializers.Serializer):

    id = serializers.CharField(source='pk', read_only=True)
    risk = RiskSerializer()
    owner = RiskOwnerSerializer()
    likelihood = serializers.CharField()
    impact = serializers.CharField()
    score = serializers.IntegerField()
    assessed_on = serializers.DateField()

Create class-based views with APIView

We will use @extend_schema to provide detailed, accurate documentation for each endpoint.

#views.py

from rest_framework import status
from rest_framework.response import Response
from rest_framework.views import APIView
from drf_spectacular.utils import extend_schema

from riskapp.models import (
    RiskAssessment,
    Risk,
    RiskCategory,
    RiskOwner,
)
from riskapp.serializers import (
    AssessmentSerializer,
    AssessmentResponseSerializer,
)

DB = 'encrypted'  # the alias defined in DATABASES, not the MongoDB database name


class AssessmentView(APIView):

    @extend_schema(
        responses=AssessmentResponseSerializer(many=True),
        summary="List all assessments",
        tags=["Assessments"],
    )
    def get(self, request):
        assessments = RiskAssessment.objects.using(DB).all()
        serializer = AssessmentResponseSerializer(assessments, many=True)
        return Response(serializer.data)

    @extend_schema(
        request=AssessmentSerializer,
        responses=AssessmentResponseSerializer,
        summary="Create an assessment",
        tags=["Assessments"],
    )
    def post(self, request):
        serializer = AssessmentSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        data = serializer.validated_data

        risk_data = data['risk']
        category_data = risk_data['category']

        assessment = RiskAssessment(
            risk=Risk(
                title=risk_data['title'],
                description=risk_data.get('description', ''),
                category=RiskCategory(
                    name=category_data['name'],
                    description=category_data.get('description', ''),
                ),
            ),
            likelihood=data['likelihood'],
            impact=data['impact'],
            score=data['score'],
        )

        if data.get('owner'):
            owner_data = data['owner']
            assessment.owner = RiskOwner(
                username=owner_data['username'],
                email=owner_data.get('email', ''),
                role=owner_data.get('role', ''),
            )

        assessment.save(using=DB)

        return Response(
            AssessmentResponseSerializer(assessment).data,
            status=status.HTTP_201_CREATED,
        )


Add urls.py at the project and app level.

#querysafe/urls.py

from django.contrib import admin
from django.urls import path, include
from drf_spectacular.views import SpectacularAPIView, SpectacularSwaggerView

urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/', include('riskapp.urls')),
    path('api/schema/', SpectacularAPIView.as_view(), name='schema'),  # OpenAPI schema consumed by Swagger UI
    path('api/docs/', SpectacularSwaggerView.as_view(url_name='schema'), name='swagger-ui'),  # Swagger UI
]

#riskapp/urls.py

from django.urls import path
from riskapp.views import AssessmentView

urlpatterns = [
    path('assessments/', AssessmentView.as_view(), name='assessments'),
]

Next, we will set up the directory structure for custom management commands by:

  • Creating a folder called management in riskapp and, inside it, a folder called commands, which is where we will add a Python script to seed the data. This can be done either via the GUI or by running mkdir -p riskapp/management/commands.
  • Then creating an empty __init__.py file in each directory to make them Python packages, either by running touch riskapp/management/__init__.py and touch riskapp/management/commands/__init__.py or by creating them via the GUI.
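
For readers who prefer to script this step, the same layout can be created with Python's standard library. This is a minimal sketch; the function name create_command_package is an assumption, and it expects to be given the directory that contains riskapp.

```python
from pathlib import Path


def create_command_package(project_root: Path) -> Path:
    """Create riskapp/management/commands plus an empty __init__.py in each new directory."""
    commands_dir = project_root / "riskapp" / "management" / "commands"
    # Equivalent to: mkdir -p riskapp/management/commands
    commands_dir.mkdir(parents=True, exist_ok=True)
    # Equivalent to: touch management/__init__.py and commands/__init__.py
    for package_dir in (commands_dir.parent, commands_dir):
        (package_dir / "__init__.py").touch(exist_ok=True)
    return commands_dir
```

Calling create_command_package(Path.cwd()) from the project root would produce the same result as the mkdir and touch commands.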

Seed the data with seed_data.py, saved in riskapp/management/commands/, which contains the mock data below.

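NB: Because the encrypted fields in this tutorial are not configured for queries, the seed script cannot look up existing records to avoid duplicates. It builds each document from in-memory objects and calls save(), and it offers a --target flush option to clear the data so the script stays rerunnable.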

#seed_data.py

from django.core.management.base import BaseCommand
from django.db import transaction

from riskapp.models import (
    RiskAssessment,
    Risk,
    RiskCategory,
    RiskOwner,
)
ASSESSMENTS = [
    {
        "risk_title": "NoSQL Injection",
        "risk_description": "Potential NoSQL injection via unsanitised operators",
        "category": "Security",
        "cat_desc": "Security and authentication data",
        "likelihood": "High",
        "impact": "High",
        "score": 9,
    },
    {
        "risk_title": "Data Exposure",
        "risk_description": "Sensitive data returned without field projection",
        "category": "Personal",
        "cat_desc": "Personal identifiable information (PII)",
        "likelihood": "Medium",
        "impact": "High",
        "score": 7,
    },
    {
        "risk_title": "Privilege Escalation",
        "risk_description": "Attempt to modify roles or permissions",
        "category": "Security",
        "cat_desc": "Security and authentication data",
        "likelihood": "Low",
        "impact": "High",
        "score": 6,
    },
    {
        "risk_title": "GDPR Violation",
        "risk_description": "Query accesses EU citizen data without consent check",
        "category": "Compliance",
        "cat_desc": "Regulatory and compliance data",
        "likelihood": "Medium",
        "impact": "Medium",
        "score": 5,
    },
    {
        "risk_title": "PHI Exposure",
        "risk_description": "Protected health information returned in query",
        "category": "Healthcare",
        "cat_desc": "Medical and health records (PHI)",
        "likelihood": "High",
        "impact": "High",
        "score": 9,
    },
    {
        "risk_title": "Credit Card Leak",
        "risk_description": "Payment card data queried without masking",
        "category": "Financial",
        "cat_desc": "Financial data and transactions",
        "likelihood": "Medium",
        "impact": "High",
        "score": 8,
    },
    {
        "risk_title": "Mass Data Export",
        "risk_description": "Query fetches excessive number of documents",
        "category": "Compliance",
        "cat_desc": "Regulatory and compliance data",
        "likelihood": "Low",
        "impact": "Medium",
        "score": 4,
    },
    {
        "risk_title": "Unindexed Scan",
        "risk_description": "Collection scan on sensitive collection",
        "category": "Financial",
        "cat_desc": "Financial data and transactions",
        "likelihood": "High",
        "impact": "Low",
        "score": 3,
    },
]

class Command(BaseCommand):
    """
    Seed the database with sample risk assessment data.
    Use --target flush to clear data before reseeding.
    """

    help = "Seed the database with sample risk assessment data"

    def add_arguments(self, parser):
        parser.add_argument(
            '--target',
            default='all',
            choices=['all', 'default', 'encrypted', 'flush'],
            help="'flush' to clear data, 'all' to seed both, or specify a database",
        )

    def handle(self, *args, **options):
        target = options['target']

        if target == 'flush':
            self._flush('default')
            self._flush('encrypted')
            return

        if target == 'all':
            databases = ['default', 'encrypted']
        else:
            databases = [target]

        try:
            for db in databases:
                self.db = db
                self.stdout.write(f"\nSeeding into: {self.db}")
                with transaction.atomic(using=self.db):
                    self._seed_assessments()
            self.stdout.write(self.style.SUCCESS("\nAll data seeded successfully!"))
        except Exception as e:
            self.stdout.write(self.style.ERROR(f"\nSeeding failed: {e}"))

    def _flush(self, db):
        RiskAssessment.objects.using(db).all().delete()
        self.stdout.write(self.style.WARNING(f"Flushed data from: {db}"))

    def _seed_assessments(self):
        self.stdout.write("\n--- Seeding Risk Assessments ---")

        owner = RiskOwner(
            username="assessor",
            email="assessor@querysafe.com",
            role="Analyst",
        )

        for item in ASSESSMENTS:
            assessment = RiskAssessment(
                risk=Risk(
                    title=item["risk_title"],
                    description=item["risk_description"],
                    category=RiskCategory(
                        name=item["category"],
                        description=item["cat_desc"],
                    ),
                ),
                owner=owner,
                likelihood=item["likelihood"],
                impact=item["impact"],
                score=item["score"],
            )
            assessment.save(using=self.db)
            self.stdout.write(
                f"  Created: {item['risk_title']} ({item['likelihood']}/{item['impact']})"
            )

Run the script file in the terminal
python manage.py seed_data

Swagger Documentation
Next, we add Swagger UI to create a presentation layer for our API.

Install drf-spectacular (skip this if you already installed it from requirements.txt)
pip install drf-spectacular

Next, add it to your INSTALLED_APPS and configure it in settings.py:

INSTALLED_APPS = [
    'querysafe.apps.MongoAdminConfig',
    'querysafe.apps.MongoAuthConfig',
    'querysafe.apps.MongoContentTypesConfig',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'django_mongodb_backend',
    'rest_framework',
    'drf_spectacular',
    'riskapp',
]

REST_FRAMEWORK = {
    "DEFAULT_SCHEMA_CLASS": "drf_spectacular.openapi.AutoSchema",
}

SPECTACULAR_SETTINGS = {
    "TITLE": "QuerySafe API",
    "DESCRIPTION": "API for Risk Assessment using Queryable Encryption",
    "VERSION": "1.0.0",
    "SERVE_INCLUDE_SCHEMA": False,
}

Next, run the server to access the Swagger UI at http://localhost:8000/api/docs/
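
Once the server is running, the POST endpoint can also be exercised from a script. The sketch below only builds the request with the standard library; the helper name build_assessment_request and the sample payload values are assumptions, with field names taken from AssessmentSerializer.

```python
import json
import urllib.request

# Sample payload; the keys mirror AssessmentSerializer, the values are made up.
payload = {
    "risk": {
        "title": "Data Exposure",
        "description": "Sensitive data returned without field projection",
        "category": {
            "name": "Personal",
            "description": "Personal identifiable information (PII)",
        },
    },
    "owner": {"username": "assessor", "email": "assessor@querysafe.com", "role": "Analyst"},
    "likelihood": "Medium",
    "impact": "High",
    "score": 7,
}


def build_assessment_request(base_url: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /api/assessments/."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/assessments/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_assessment_request("http://localhost:8000", payload)
```

Passing request to urllib.request.urlopen() sends it. If everything is wired up as in this tutorial, the view should answer with status 201 and the assessment in plain text.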

[Screenshot: MongoDB Atlas showing risk assessments as encrypted]

[Screenshot: MongoDB Atlas showing encrypted embedded models and fields]

Swagger showing plain text (auto-decrypted)

Endpoint: http://127.0.0.1:8000/api/docs/

[Screenshot: GET method]

[Screenshot: POST method]

[Screenshot: terminal showing encrypted data (binary) stored in the MongoDB database]

After querying via terminal using find_one(), the data is shown as encrypted (represented in binary).
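
One way to check this programmatically is to walk a raw document and list the fields whose values are binary. The sketch below assumes the document was fetched with find_one() on a client without automatic decryption, where encrypted values arrive as BSON binary (a bytes subclass in PyMongo). The helper name binary_paths and the sample document are illustrative, with short byte strings standing in for real ciphertext.

```python
def binary_paths(document, prefix=""):
    """Return dotted paths of every bytes-like value in a nested document."""
    paths = []
    for key, value in document.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into embedded documents such as "risk" or "owner".
            paths.extend(binary_paths(value, prefix=f"{path}."))
        elif isinstance(value, (bytes, bytearray)):
            paths.append(path)
    return paths


# Stand-in for a raw document; real ciphertext would be much longer.
raw = {
    "risk": {"title": b"\x0e\x9a", "category": {"name": b"\x0e\x11"}},
    "likelihood": "High",
    "score": 9,
}
print(binary_paths(raw))
```

Fields that were left unencrypted, such as likelihood and score, do not appear in the result.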

Limitations

  • With the encrypted model fields as configured in this tutorial, you cannot use get_or_create() or filter() on them: each encryption produces a different ciphertext, so equality lookups never match.
  • Performance cost of client-side encryption/decryption: every write to an encrypted field is encrypted on the client, and every read is decrypted on the client. After fetching the data, your application server must decrypt each encrypted field of every record. Listing 1000 records therefore requires at least 1000 decryption operations, which adds CPU overhead and slows down response times.
  • Why you can't encrypt everything:
    • Dates: if a date field were encrypted, you could not auto-populate timestamps (auto_now_add), order results (order_by('-created_at')), or filter by date ranges (filter(created_at__gte=last_week)).
    • Scores/numbers: the database can't perform ORDER BY, SUM(), AVG(), MIN(), MAX(), or range queries (filter(risk_score__gte=80)) on encrypted values that are not configured for such queries, because it just sees random ciphertext.
    • Indexed fields: CharField columns with choices, such as likelihood and impact, need to remain queryable, so they can’t be encrypted.
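
The first limitation can be illustrated without MongoDB at all. The toy cipher below is not the algorithm Queryable Encryption uses and must not be used for real data. It only shows that when every encryption mixes in a fresh random nonce, the same plaintext yields different ciphertexts, so comparing stored bytes against a newly encrypted search term cannot match. All names here are made up for the illustration.

```python
import hashlib
import os

KEY = os.urandom(32)  # toy key, for illustration only


def _keystream(nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key and nonce."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(KEY + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return stream[:length]


def toy_encrypt(plaintext: str) -> bytes:
    """Prefix a fresh random nonce, then XOR the plaintext with the keystream."""
    data = plaintext.encode("utf-8")
    nonce = os.urandom(16)
    return nonce + bytes(a ^ b for a, b in zip(data, _keystream(nonce, len(data))))


def toy_decrypt(ciphertext: bytes) -> str:
    """Split off the nonce and reverse the XOR."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    return bytes(a ^ b for a, b in zip(body, _keystream(nonce, len(body)))).decode("utf-8")


stored = toy_encrypt("Financial")
search_term = toy_encrypt("Financial")
print(stored == search_term)               # byte-for-byte comparison of the two ciphertexts
print(toy_decrypt(stored) == "Financial")  # decryption still recovers the original value
```

This is why the seed script and the views in this tutorial never filter on an encrypted field and work with decrypted objects in the application instead.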

Key Takeaways

  • Only encrypt fields that hold sensitive data and that you never need to search, sort, or filter by at the database level.
  • In the event of a database breach, the binary data won't be of any use to attackers without the encryption keys, which keeps your data confidential (the MongoDB server never sees your plaintext data).
