
Abigail Afi Gbadago for MongoDB


Accessing Data With Django MongoDB Backend: Queries to Aggregation Pipelines

Image showing accessing data

Why use MongoDB with Django?

The MongoDB document model goes well with Django's mission to “encourage rapid development and clean, pragmatic design.” A MongoDB document mirrors how developers already think about and structure data in code, so a Django model maps naturally onto a document. This lets Django developers model their data differently, especially in modern applications with hierarchical, semi-structured, and rapidly evolving data. Because data that is accessed together is stored together, many use cases need far fewer JOINs.

While Django was built for relational database management systems (RDBMS), using a non-relational database such as MongoDB provides advantages such as schema flexibility, aggregation pipelines, dynamic and nested data structures, high-speed I/O operations, and easier scaling, since it is designed to scale horizontally through sharding with less overhead.

In this tutorial, we will be accessing and aggregating data using the Django MongoDB Backend.

Setting up Django with MongoDB

Connecting Django to MongoDB

There are different ways of using Python/Django with MongoDB, such as:

  • PyMongo: the native Python driver for MongoDB (not an ORM).
  • MongoEngine: a community-backed Object Document Mapper (ODM) for MongoDB, which builds on PyMongo.
  • Django MongoDB Backend.

Prerequisites

Installation and setup

To set up and connect to a MongoDB instance using the Django MongoDB Backend, follow the Django MongoDB Backend documentation to create a Django project.

After creating your Django project, you should see the “Congratulations!” message and an image of a rocket when you access http://127.0.0.1:8000/ after starting your server.

You should see the quickstart directory, along with a venv folder, mongo_migrations, manage.py, and the README file.

Creating the application in the Django project

In this section, we will:

  • Create an application in our Django project that uses the sample_mflix database. Take a few minutes to study the data in its collections, because we will model it in our Django models.py file.
  • Look at a use-case of creating a movie model with embedded awards.
  • Perform CRUD operations: create, query, update nested documents on the model.

Creating the Django app with the custom MongoDB template

  • From the root directory in your Django project, run the following command to create a new Django app (in this case, called cine_flix) based on a custom Django MongoDB template:
python manage.py startapp cine_flix --template 
  • The django-mongodb-app template ensures that your apps.py file includes the line
default_auto_field = 'django_mongodb_backend.fields.ObjectIdAutoField'

This setting is needed because Django defaults to an integer primary key named id, whereas MongoDB uses an ObjectId stored in _id. If we skip it, we may end up with mismatched primary key types, which will affect migrations and Django admin operations. If you did not use the template, make sure you’ve followed the steps in configuring a project to use Django MongoDB Backend.

  • After executing the command, you will see your cine_flix app nested in the root directory.
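
To make the primary key difference concrete, here is a small standard-library sketch (not part of the backend) that unpacks the 12-byte layout of a MongoDB ObjectId: a 4-byte timestamp, a 5-byte random value, and a 3-byte counter. The helper name split_object_id and the sample value are assumptions made for illustration.

```python
from datetime import datetime, timezone


def split_object_id(hex_id: str) -> dict:
    """Split a 24-character ObjectId hex string into its three parts."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    raw = bytes.fromhex(hex_id)
    seconds = int.from_bytes(raw[:4], "big")  # creation time in seconds since the epoch
    return {
        "created": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "random": raw[4:9].hex(),  # 5-byte random value
        "counter": int.from_bytes(raw[9:], "big"),  # 3-byte incrementing counter
    }


# Illustrative ObjectId value, not taken from the sample_mflix data
parts = split_object_id("507f1f77bcf86cd799439011")
print(parts["created"].isoformat(), parts["random"], parts["counter"])
```

An integer id carries none of this structure, which is one reason the two key types cannot be mixed freely.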

Use case: a Movie model with an embedded Award document and a genres array

We will model our data based on the sample_mflix data.

To do this, open the models.py file in the cine_flix directory and replace its contents with the following code:

from django.db import models
from django_mongodb_backend.fields import EmbeddedModelField, ArrayField
from django_mongodb_backend.managers import MongoManager
from django_mongodb_backend.models import EmbeddedModel

# Embedded document stored inside each Movie document
class Award(EmbeddedModel):
    wins = models.IntegerField(default=0)
    nominations = models.IntegerField(default=0)
    text = models.CharField(max_length=100)

class Movie(models.Model):
    title = models.CharField(max_length=200)
    plot = models.TextField(blank=True)
    runtime = models.IntegerField(default=0)
    released = models.DateTimeField("release date", null=True, blank=True)
    awards = EmbeddedModelField(Award, null=True, blank=True)
    genres = ArrayField(models.CharField(max_length=100), null=True, blank=True)
    # MongoManager adds MongoDB-specific methods such as raw_aggregate()
    objects = MongoManager()

    class Meta:
        db_table = "movies"
        managed = False

    def __str__(self):
        return self.title

class Viewer(models.Model):
    name = models.CharField(max_length=100)
    email = models.CharField(max_length=200)

    class Meta:
        db_table = "users"
        managed = False

    def __str__(self):
        return self.name

The models.py file contains:

  • A Movie model with an embedded model field named awards, which stores an Award object.
  • An array field on Movie named genres, which stores a list of genres that describe the movie.
  • A Viewer model, which stores account information for movie viewers.
  • An Award model, which represents the embedded document values stored in the Movie model.
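
As a rough illustration of how these models map onto a single MongoDB document, the sketch below uses plain dataclasses as stand-ins for the Django models. The class names AwardDoc and MovieDoc are hypothetical, and a real stored document also carries an _id plus the remaining fields.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class AwardDoc:
    # Stand-in for the embedded Award model
    wins: int = 0
    nominations: int = 0
    text: str = ""


@dataclass
class MovieDoc:
    # Stand-in for the Movie model; awards nests inside the same document
    title: str
    runtime: int = 0
    awards: Optional[AwardDoc] = None
    genres: List[str] = field(default_factory=list)


movie_doc = asdict(
    MovieDoc(
        title="Mad Max: Fury Road",
        runtime=120,
        awards=AwardDoc(wins=14, nominations=6, text="Won 8 Oscars"),
        genres=["Action", "Adventure"],
    )
)
print(movie_doc)
```

The awards values live inside the movie document itself rather than in a separate table, which is why no JOIN is needed to read them.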

Now, save the models.py file and let’s populate our views.py file.

Django views

A Django view is a Python function or class that takes an HTTP request and returns an HTTP response, typically rendering data (e.g., from our models.py file) via a template or returning other responses like JSON. Views expose our data through endpoints that can perform CRUD operations (create, read/query, update, and delete), including on nested documents.

In the views.py file in the cine_flix directory, add the following lines of code to create views that display data:

from django.http import HttpResponse
from django.shortcuts import render

from .models import Movie, Viewer

def index(request):
    return HttpResponse("Hello, world.")

def recent_movies(request):
    movies = Movie.objects.order_by("-released")[:5]
    return render(request, "recent_movies.html", {"movies": movies})

def viewers_list(request):
    viewers = Viewer.objects.order_by("name")[:5]
    return render(request, "viewers_list.html", {"viewers": viewers})

These views display a landing page message and information about your Movie and Viewer models.
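
To show what order_by("-released")[:5] asks the database to do, here is a plain-Python sketch of the same idea: sort newest first, then keep five results. The sample movies are hypothetical, and the backend performs the equivalent sort and limit inside MongoDB rather than in Python.

```python
from datetime import datetime

# Hypothetical sample data: seven movies released in consecutive years
movies = [
    {"title": f"Movie {year}", "released": datetime(year, 1, 1)}
    for year in range(2015, 2022)
]

# order_by("-released") sorts newest first; [:5] keeps only the first five
recent = sorted(movies, key=lambda m: m["released"], reverse=True)[:5]

for movie in recent:
    print(movie["title"], movie["released"].year)
```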

Configuring URLs for Django views

Create a new file called urls.py in the root of your cine_flix directory. To connect the views we created in the previous step to URLs, paste the following code into urls.py.

from django.urls import path

from . import views

urlpatterns = [
    path("recent_movies/", views.recent_movies, name="recent_movies"),
    path("viewers_list/", views.viewers_list, name="viewers_list"),
    path("", views.index, name="index"),
]

After that, find the other urls.py file in the project folder—which, in this case, is quickstart, so the path will be the quickstart/urls.py file. Then, add this line of code and save the file:

path("", include("cine_flix.urls")),

And the final urls.py will look like this:

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path("admin/", admin.site.urls),
    path("", include("cine_flix.urls")),
]

Django templates

Templates act as a separation between your Django application logic and presentation of data. They render data in the form of UI for users to interact with.

In your cine_flix app directory, create a subdirectory called templates, which will store the HTML files used to render your data.

Then, create a file called recent_movies.html, paste the following code, and save:

<!-- templates/recent_movies.html -->
<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="UTF-8">
   <meta name="viewport" content="width=device-width, initial-scale=1.0">
   <title>Recent Movies</title>
</head>
<body>
   <h1>Five Most Recent Movies</h1>
   <ul>
      {% for movie in movies %}
            <li>
               <strong>{{ movie.title }}</strong> (Released: {{ movie.released }})
            </li>
      {% empty %}
            <li>No movies found.</li>
      {% endfor %}
   </ul>
</body>
</html>

This template formats the movie data requested by the recent_movies view.

Next, create another file in the cine_flix/templates directory called viewers_list.html, paste the following code, and save:

<!-- templates/viewers_list.html -->
<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="UTF-8">
   <meta name="viewport" content="width=device-width, initial-scale=1.0">
   <title>Viewers List</title>
</head>
<body>
   <h1>Alphabetical Viewers List</h1>
   <table>
      <thead>
            <tr>
               <th>Name</th>
               <th>Email</th>
            </tr>
      </thead>
      <tbody>
            {% for viewer in viewers %}
               <tr>
                  <td>{{ viewer.name }}</td>
                  <td>{{ viewer.email }}</td>
               </tr>
            {% empty %}
               <tr>
                  <td colspan="2">No viewer found.</td>
               </tr>
            {% endfor %}
      </tbody>
   </table>
</body>
</html>

This template formats the user data requested by the viewers_list view.

Register the cine_flix app

At this point, we need to register the cine_flix app with Django so that its models, migrations, templates, and admin integration are picked up. To do that, open the settings.py file in the quickstart project and add the cine_flix entry to the top of the INSTALLED_APPS setting so that it resembles the following code:

INSTALLED_APPS = [
    'cine_flix.apps.CineFlixConfig',
    'quickstart.apps.MongoAdminConfig',
    'quickstart.apps.MongoAuthConfig',
    'quickstart.apps.MongoContentTypesConfig',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

Run migrations

In your terminal, from the project root directory, create migrations for the Movie, Award, and Viewer models and apply the changes to the database:

python manage.py makemigrations cine_flix
python manage.py migrate

Afterward, you should see the migrations applied with OK statuses.

At this point, we have a basic Django MongoDB Backend app that can access the sample_mflix MongoDB Atlas database.

Now, let’s create and interact with data via CRUD operations.

Perform CRUD operations: create, read/query, update, delete nested documents

To perform CRUD operations, we need to use the Python interactive shell on our model objects.

To start the Python shell, run the following command in the terminal:

python manage.py shell

Then import the required classes and modules with the following lines of code from your Python shell:

from cine_flix.models import Movie, Award, Viewer
from django.utils import timezone
from datetime import datetime

Create a Movie object

Example:

movie_awards = Award(wins=9, nominations=11, text="Won 7 Oscars")
movie = Movie.objects.create(
    title="Everything Everywhere All at Once",
    plot="Evelyn Wang, a Chinese‑American laundromat owner, finds herself swept into a multiversal adventure to prevent the collapse of existence, exploring alternate lives she could’ve led—all while dealing with family dysfunction and mid‑life malaise",
    runtime=132,
    released=timezone.make_aware(datetime(2024, 6, 22)),
    awards=movie_awards,
    genres=["Adventure", "Comedy"]
)

movie_awards = Award(wins=14, nominations=6, text="Won 8 Oscars")
movie2 = Movie.objects.create(
    title="Mad Max: Fury Road",
    plot="In a brutal post‑apocalyptic wasteland, Imperator Furiosa helps five women escape from a tyrant; she joins forces with Max Rockatansky in a high-octane chase across the desert in the armored War Rig",
    runtime=120,
    released=timezone.make_aware(datetime(2015, 5, 15)),
    awards=movie_awards,
    genres=["Action", "Adventure"]
)

Insert a Viewer object into the database

You can also use your Viewer model to insert documents into the sample_mflix.users collection. Run the following code to create two Viewer objects that store data about movie viewers named "Alex Carter" and "Solomon Cruz":

viewer = Viewer.objects.create(
    name="Alex Carter",
    email="alexcarter@fakegmail.com"
)

viewer = Viewer.objects.create(
    name="Solomon Cruz",
    email="solomoncruz@fakegmail.com"
)

Query (Read)

Viewer.objects.filter(email="alexcarter@fakegmail.com").first()

Update

movie.runtime = 117
movie.save()

We can verify the update by running:

>>> query = Movie.objects.filter(runtime=117)
>>> print(query)

This returns <MongoQuerySet [<Movie: Everything Everywhere All at Once>]> as the result, since the movie Everything Everywhere All at Once had its runtime updated from 132 to 117.

Delete

One movie viewer named "Solomon Cruz" no longer uses the movie streaming site. To remove this viewer's corresponding document from the database, run the following code:

old_viewer = Viewer.objects.filter(name="Solomon Cruz").first()
old_viewer.delete()

Exiting the Python Shell

exit()

Running your server

python manage.py runserver

Render your new objects

To confirm that you inserted the Movie objects into the database, visit http://127.0.0.1:8000/recent_movies/. You should see the two movies we added via the shell.

Then, confirm that you inserted a Viewer object into the database by visiting http://127.0.0.1:8000/viewers_list/. Only one of the two viewers we created appears, because we deleted the other in the previous step.

Querying embedded models and arrays

  • Filter on a field of an embedded model. Django uses double underscores where MongoDB uses dot notation:
>>> selected_movie = Movie.objects.filter(awards__wins__gt=7)
>>> print(selected_movie)
<QuerySet [<Movie: Everything Everywhere All at Once>, <Movie: Mad Max: Fury Road>]>

The output contains both movies, since Everything Everywhere All at Once has 9 award wins and Mad Max: Fury Road has 14, and both values are greater than 7. Internally, the double underscores in the Django lookup act just like MongoDB's dot notation (awards.wins).

Image showing querying via the dot notation/double underscores

  • Find a specific genre within the genres array.
>>> Movie.objects.filter(genres__contains=["Action"])
<MongoQuerySet [<Movie: Mad Max: Fury Road>]>

The output is <MongoQuerySet [<Movie: Mad Max: Fury Road>]> since it matches the Movie object containing the value Action in the genres array.

  • Find more than one genre within a movie object.
>>> Movie.objects.filter(genres__contains=["Comedy", "Adventure"])
<MongoQuerySet [<Movie: Everything Everywhere All at Once>]>

The output is <MongoQuerySet [<Movie: Everything Everywhere All at Once>]>.

  • Update the movie object with a specific genre.
>>> updated_genre = Movie.objects.filter(title="Everything Everywhere All at Once").update(genres=["Thriller"])

To verify, run:

Movie.objects.filter(title="Everything Everywhere All at Once").values('title', 'genres').first()


The output is {'title': 'Everything Everywhere All at Once', 'genres': ['Thriller']}.
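
The queries in this section can be pictured with a small plain-Python sketch. It shows how a double-underscore lookup corresponds to a dotted MongoDB path, and how an array "contains" check behaves like a subset test. This is only an illustration under simplified assumptions, not how the backend compiles queries internally; the helper names lookup_to_filter, get_path, and array_contains are hypothetical.

```python
# Django lookup suffix -> MongoDB comparison operator
COMPARISONS = {"gt": "$gt", "gte": "$gte", "lt": "$lt", "lte": "$lte"}


def lookup_to_filter(lookup, value):
    """Turn 'awards__wins__gt' into {'awards.wins': {'$gt': value}}."""
    parts = lookup.split("__")
    if parts[-1] in COMPARISONS:
        return {".".join(parts[:-1]): {COMPARISONS[parts[-1]]: value}}
    return {".".join(parts): value}


def get_path(doc, dotted_path):
    """Follow a dotted path such as 'awards.wins' through nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def array_contains(values, wanted):
    """True when every wanted item appears in values (a subset check)."""
    return set(wanted) <= set(values or [])


# Simplified copies of the two movies created earlier in the shell
movies = [
    {"title": "Everything Everywhere All at Once",
     "awards": {"wins": 9}, "genres": ["Adventure", "Comedy"]},
    {"title": "Mad Max: Fury Road",
     "awards": {"wins": 14}, "genres": ["Action", "Adventure"]},
]

mongo_filter = lookup_to_filter("awards__wins__gt", 7)
many_wins = [m["title"] for m in movies if (get_path(m, "awards.wins") or 0) > 7]
action = [m["title"] for m in movies if array_contains(m["genres"], ["Action"])]
print(mongo_filter, many_wins, action)
```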

Challenges

As the data grows, filtering, sorting, or aggregating within large arrays can become slow. To avoid this, restructure large arrays by splitting or capping them, and index the subfields you query most often so that lookups stay fast.
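
One way to picture capping an array is sketched below in plain Python, next to the shape of a MongoDB update that uses $push with $each and a negative $slice to keep only the most recent entries. The comments field and both helper names are hypothetical and are not part of the tutorial's models.

```python
def push_capped(items, new_item, cap):
    """Append new_item and keep only the most recent `cap` entries."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    return (items + [new_item])[-cap:]


def capped_push_update(field_name, new_item, cap):
    """Build an update document that appends and trims in one operation."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    # A negative $slice keeps the last `cap` elements of the array
    return {"$push": {field_name: {"$each": [new_item], "$slice": -cap}}}


comments = ["first", "second", "third"]
comments = push_capped(comments, "fourth", cap=3)
update_doc = capped_push_update("comments", "fourth", cap=3)
print(comments, update_doc)
```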

Overview of MongoDB’s aggregation pipeline

MongoDB’s aggregation pipeline consists of one or more stages that process documents. Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values. The documents that are output from a stage are passed to the next stage.
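
The stage-by-stage flow can be mimicked in a few lines of plain Python. This is a conceptual sketch only: the functions match_genre, group_average, and run_pipeline are hypothetical stand-ins for $match and $group, and the sample documents are made up.

```python
def match_genre(docs, genre):
    """Stand-in for a $match stage: keep documents listing the genre."""
    return [d for d in docs if genre in d.get("genres", [])]


def group_average(docs, field_name):
    """Stand-in for a $group stage with $avg over one field."""
    values = [d[field_name] for d in docs if field_name in d]
    if not values:
        return []
    return [{"_id": None, "avg_" + field_name: sum(values) / len(values)}]


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs


sample_docs = [
    {"title": "A", "runtime": 132, "genres": ["Adventure", "Comedy"]},
    {"title": "B", "runtime": 120, "genres": ["Action", "Adventure"]},
    {"title": "C", "runtime": 90, "genres": ["Drama"]},
]

result = run_pipeline(
    sample_docs,
    [
        lambda docs: match_genre(docs, "Adventure"),
        lambda docs: group_average(docs, "runtime"),
    ],
)
print(result)
```

In MongoDB the same work happens inside the database, so only the final grouped document travels back to the application.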

You may already be familiar with the aggregation pipeline from MongoDB itself. It is now also available in the Django MongoDB Backend, which lets you use MongoDB-specific operations outside the constraints of typical Django lookup queries.

Examples of such cases are:

  • Performing geospatial lookups with MongoDB's $near operator, which is not available in Django.

  • Hybrid Search which couples the precision of text search with the semantic understanding of vector search to deliver the most relevant answers to end users.

  • Hyper-optimized queries which use MongoDB native operators to improve the efficiency of read operations by reducing the amount of data that query operations need to process.

Raw MongoDB Django aggregation example

Let’s run an aggregation pipeline to determine the average runtime of the Adventure movies in our collection.

from cine_flix.models import Movie

pipeline = [
  {"$match": {"genres": "Adventure"}},
  {"$group": {"_id": None, "avg_runtime": {"$avg": "$runtime"}}}
]

agg_query = Movie.objects.raw_aggregate(pipeline)

for result in agg_query:
    print("Average runtime:", result.avg_runtime)

Image showing querying via an aggregation pipeline

You can also run the pipeline inside a Django view by adding these lines of code to your views.py:

def avg_runtime_view(request):
    # Average the runtime of all movies whose genres include "Adventure"
    pipeline = [
        {"$match": {"genres": "Adventure"}},
        {"$group": {"_id": None, "avg_runtime": {"$avg": "$runtime"}}}
    ]

    # Evaluate the raw queryset once; the list is empty when no movie matches
    results = list(Movie.objects.raw_aggregate(pipeline))
    avg = results[0].avg_runtime if results else None

    return render(request, "average_runtime.html", {"avg_runtime": avg})

Next, create an average_runtime.html template in the templates folder and add these lines of code:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="UTF-8">
   <meta name="viewport" content="width=device-width, initial-scale=1.0">
   <title>Average Runtime</title>
</head>

<body>

    <h1>Average Runtime for Adventure Movies</h1>

    {% if avg_runtime is not None %}
      <p>The average runtime is <strong>{{ avg_runtime }}</strong> minutes.</p>
    {% else %}
      <p>No data found.</p>
    {% endif %}
</body>
</html>


Lastly, add the URL pattern to the urlpatterns list in the app’s urls.py file:

   path("avg-runtime/", views.avg_runtime_view, name="avg_runtime"),

Then, run the server using python manage.py runserver and visit http://127.0.0.1:8000/avg-runtime/. The image below shows the average runtime returned by the pipeline.

Image showing average runtime for adventure movies

Best practices when working with data

  • Keep embedded arrays at a manageable size to avoid bloated documents.
  • Model your data properly to suit your query patterns so that aggregation or pipeline stages work efficiently.
  • Use caching and pagination for aggregated data to reduce server load and increase performance with large datasets.
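
As a minimal sketch of the pagination advice, the helper below builds $skip and $limit stages that could be appended to an aggregation pipeline, alongside a plain-Python equivalent for lists. The function names are hypothetical, and a $sort stage would normally come first so that pages are stable.

```python
def pagination_stages(page, per_page):
    """Build $skip and $limit stages for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    return [{"$skip": (page - 1) * per_page}, {"$limit": per_page}]


def paginate(items, page, per_page):
    """Plain-Python equivalent: slice out one page of a list."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    start = (page - 1) * per_page
    return items[start:start + per_page]


stages = pagination_stages(page=3, per_page=10)
page_items = paginate(list(range(25)), page=3, per_page=10)
print(stages, page_items)
```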

Conclusion

In this tutorial, we looked at accessing data with the Django MongoDB Backend: installing and setting up the backend, creating an application within a Django project, performing CRUD operations on the Movie object, querying embedded models and arrays, the challenges of large arrays, raw aggregation, and best practices when working with data.

Using the Django MongoDB Backend is an efficient way to work with data, because related data is stored together even when document structures vary. Try it out and let me know how it goes.

If you have any questions, feel free to reach out to our community: MongoDB Community Forum.
