When you design a database for a large product, you will inevitably reach a point where two models are related in a way that a ForeignKey alone cannot express: a many-to-many relationship.
A good example of a many-to-many relationship is the relationship between a sandwich and a sauce. I like a chicken teriyaki sandwich but only if it contains barbeque sauce as well as mayonnaise sauce. So the same sandwich can have multiple sauces. At the same time I want mayonnaise sauce to appear on a turkey sandwich as well, so the same sauce can be used on different kinds of sandwiches.
This is a great place to use the ManyToManyField offered by Django instead of a regular ForeignKey. Unfortunately the way Django deals with this is a bit unintuitive and can be confusing, so I thought it would be best to demonstrate how a many-to-many relationship works under the hood.
Many-to-many relationship in a database
To maintain a many-to-many relationship between two tables in a relational database, the standard approach is to add a third table that holds references to both of those tables. This table is called a “through” table and each entry in it connects one row of the source table (sandwich in this case) to one row of the target table (sauce in this case).
This is exactly what Django does under the hood when you use a ManyToManyField. It creates a through model that you never have to declare yourself, and whenever you fetch all of the sandwiches that use a particular sauce given only the name of the sauce, all three tables (sandwiches, sauces and the through table) are joined.
Joining three tables may not be very efficient, so if you query this information using the sauce ID instead of the name, Django joins only two tables (sandwiches_sauces and sandwiches). These join operations are invisible to the user, but it helps to know what’s going on in the database so that the queries can be made as efficient as possible.
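To make the two kinds of join concrete, here is a minimal database-level sketch using Python's built-in sqlite3 module rather than Django. The table and column names are assumptions chosen for illustration (Django derives its own names from the app and model names), and the SQL is hand-written, not what the ORM emits.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sandwiches (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sauces (id INTEGER PRIMARY KEY, name TEXT);
    -- The through table: one row per (sandwich, sauce) pair.
    CREATE TABLE sandwiches_sauces (
        id INTEGER PRIMARY KEY,
        sandwich_id INTEGER REFERENCES sandwiches(id),
        sauce_id INTEGER REFERENCES sauces(id)
    );
    INSERT INTO sandwiches VALUES (1, 'Chicken Teriyaki'), (2, 'Turkey');
    INSERT INTO sauces VALUES (1, 'Barbeque'), (2, 'Mayonnaise');
    INSERT INTO sandwiches_sauces (sandwich_id, sauce_id)
        VALUES (1, 1), (1, 2), (2, 2);
""")

# Lookup by sauce name: all three tables take part in the join.
by_name = conn.execute("""
    SELECT s.name
    FROM sandwiches AS s
    JOIN sandwiches_sauces AS ss ON ss.sandwich_id = s.id
    JOIN sauces AS sc ON sc.id = ss.sauce_id
    WHERE sc.name = ?
    ORDER BY s.id
""", ("Mayonnaise",)).fetchall()

# Lookup by sauce ID: the through table already stores sauce_id,
# so the sauces table is not needed at all.
by_id = conn.execute("""
    SELECT s.name
    FROM sandwiches AS s
    JOIN sandwiches_sauces AS ss ON ss.sandwich_id = s.id
    WHERE ss.sauce_id = ?
    ORDER BY s.id
""", (2,)).fetchall()

print(by_name)
print(by_id)
```

Both queries return the same sandwiches; the second one simply touches one table fewer.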
Using the ManyToManyField
Now that we know what happens internally, let’s look at how the ManyToManyField helps abstract out all of this complexity and provides a simple interface to the person writing code.
Let us create a new Django project and an app within it called sandwiches. In models.py, define these two models.
from django.db import models


class Sauce(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name


class Sandwich(models.Model):
    name = models.CharField(max_length=100)
    sauces = models.ManyToManyField(Sauce)

    def __str__(self):
        return self.name
And that’s it. Running a database migration on these models will create a Sandwich table, a Sauce table and a through table connecting the two. Let us now look at how we can access this information using the Django ORM.
Fetching all sandwiches and sauces using each other
Let us create some data in the Sandwich and Sauce models and see how we can retrieve them.
>>> chicken_teriyaki_sandwich = Sandwich.objects.create(name="Chicken Teriyaki")
>>> bbq_sauce = Sauce.objects.create(name="Barbeque")
>>> mayo_sauce = Sauce.objects.create(name="Mayonnaise")
>>>
>>> chicken_teriyaki_sandwich.sauces.add(bbq_sauce)
>>> chicken_teriyaki_sandwich.sauces.add(mayo_sauce)
>>>
>>> chicken_teriyaki_sandwich.sauces.all()
<QuerySet [<Sauce: Barbeque>, <Sauce: Mayonnaise>]>
>>>
Running sandwich.sauces.all() gives us all the sauces applied on that Sandwich. If we want to go in the reverse direction, i.e., get all the sandwiches that use a particular sauce, we can perform the same operation on the target object through its *_set attribute.
>>> bbq_sauce = Sauce.objects.get(name="Barbeque")
>>>
>>> bbq_sauce.sandwich.all()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
AttributeError: 'Sauce' object has no attribute 'sandwich'
>>>
>>> bbq_sauce.sandwich_set.all()
<QuerySet [<Sandwich: Chicken Teriyaki>]>
>>>
As you can see, calling bbq_sauce.sandwich.all() raised an AttributeError but bbq_sauce.sandwich_set.all() worked. This is because Django names the reverse accessor on the target model after the source model in lowercase, followed by _set. This behaviour can be overridden by passing a related_name option to the ManyToManyField.
class Sandwich(models.Model):
    name = models.CharField(max_length=100)
    sauces = models.ManyToManyField(Sauce, related_name="sandwiches")

    def __str__(self):
        return self.name
Now we can perform the previous query using sandwiches instead of sandwich_set.
>>> bbq_sauce = Sauce.objects.get(name="Barbeque")
>>> bbq_sauce.sandwiches.all()
<QuerySet [<Sandwich: Chicken Teriyaki>]>
>>>
In simpler words, a related name is the name that you would like to use instead of the default *_set.
A common convention is to use the plural form of your model name in lowercase.
Fetching specific sandwiches and sauces using each other
In the above examples, we were fetching all entries in the table using a .all() function, but in most practical cases, we would want to query a subset of the data. For instance we might want to query all the sandwiches that use barbeque sauce. This can be done with a query like this:
>>> Sandwich.objects.filter(sauces__name="Barbeque")
<QuerySet [<Sandwich: Chicken Teriyaki>]>
>>>
But as I mentioned before, to perform this query Django has to join all three tables. We can make this more efficient by querying with the sauce ID instead of the name, which lets Django join only the Sandwich table and the through table.
>>>
>>> Sandwich.objects.filter(sauces__id=1)
<QuerySet [<Sandwich: Chicken Teriyaki>]>
>>>
>>>
>>>
You can also query this information in reverse, i.e., fetch all sauces that are put on a particular sandwich. (If you kept the related_name from the earlier example, the lookup would be sandwiches__name instead of sandwich__name.)
>>> Sauce.objects.filter(sandwich__name="Chicken Teriyaki")
<QuerySet [<Sauce: Barbeque>, <Sauce: Mayonnaise>]>
>>>
Even in this case I would recommend querying using the sandwich ID to make this query more efficient.
Adding items from either side of the relationship
So far we have been adding sauces to the sandwich model but we can also go the other way round pretty easily. Django internally takes care of whatever database table entries need to be created to make this happen.
The only gotcha is that if you don’t plan to use a related_name, you would have to add the item to a *_set attribute.
>>> sandwich = Sandwich.objects.create(name="Turkey")
>>>
>>> mayo_sauce = Sauce.objects.get(name="Mayonnaise")
>>>
>>> mayo_sauce.sandwich_set.add(sandwich)
>>>
Using a custom “through” model
Even though Django takes care of creating the through model on its own and keeps this invisible to a user, sometimes it becomes necessary to use a custom through model in order to add some additional fields to that model.
For instance, consider the relationship between a student and a teacher. A teacher can teach multiple students and a student can be taught by multiple teachers, which makes this a many-to-many relationship.
However in this case just having a table that connects these two entities won’t suffice because we would require extra information such as:
- The date on which a teacher started teaching a student.
- The subject that is taught by a teacher to a student.
- Duration of the course.
To sum this up, we require a “course” table that not only connects a student and a teacher but also holds this extra information.
To make this happen, one must override the default through table that Django creates and use a custom through table instead.
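To picture what such a “course” table holds, here is a small database-level sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table names, the column names (subject, start_date, duration_weeks) and the sample rows are made up, and a Django custom through model would generate its own schema.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teachers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    -- The through table links a teacher to a student and also
    -- carries the extra information about that link.
    CREATE TABLE courses (
        id INTEGER PRIMARY KEY,
        teacher_id INTEGER REFERENCES teachers(id),
        student_id INTEGER REFERENCES students(id),
        subject TEXT,
        start_date TEXT,
        duration_weeks INTEGER
    );
    INSERT INTO teachers VALUES (1, 'Teacher A');
    INSERT INTO students VALUES (1, 'Student A'), (2, 'Student B');
    INSERT INTO courses
        (teacher_id, student_id, subject, start_date, duration_weeks)
        VALUES (1, 1, 'Physics', '2024-01-15', 12),
               (1, 2, 'Maths', '2024-02-01', 8);
""")

# Everything teacher 1 teaches, with the per-link details.
rows = conn.execute("""
    SELECT st.name, c.subject, c.start_date, c.duration_weeks
    FROM courses AS c
    JOIN students AS st ON st.id = c.student_id
    WHERE c.teacher_id = ?
    ORDER BY st.id
""", (1,)).fetchall()

for row in rows:
    print(row)
```

The extra columns live on the link itself, which is why neither the teacher table nor the student table could hold them.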
Extra sauce please!
Ok enough talk about students and teachers, let’s get back into the topic that probably interested you in this blog post in the first place - food!
In all of the above examples of sandwiches and sauces, the only information we have is which sauces go on which sandwiches. But what if someone wants extra sauce of the same kind on a sandwich?
You could try adding the same sauce object to a sandwich multiple times, but Django would simply ignore the repeats because the add() method is idempotent: a particular sauce can be added to a sandwich only once. To solve this problem we can use a custom through model.
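The idempotent behaviour can be pictured at the database level. The sketch below uses Python's built-in sqlite3 module and assumes a through table with a uniqueness constraint on the (sandwich, sauce) pair, which is how Django's auto-created through model is declared. The table name and the add_sauce helper are hypothetical, and the INSERT OR IGNORE statement is only a stand-in for the idea, not the SQL Django runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical through table; each (sandwich, sauce) pair may appear once.
conn.execute("""
    CREATE TABLE sandwiches_sauces (
        id INTEGER PRIMARY KEY,
        sandwich_id INTEGER NOT NULL,
        sauce_id INTEGER NOT NULL,
        UNIQUE (sandwich_id, sauce_id)
    )
""")


def add_sauce(sandwich_id, sauce_id):
    """Link a sauce to a sandwich; repeating the call changes nothing."""
    conn.execute(
        "INSERT OR IGNORE INTO sandwiches_sauces (sandwich_id, sauce_id) "
        "VALUES (?, ?)",
        (sandwich_id, sauce_id),
    )


add_sauce(1, 1)
add_sauce(1, 1)  # same pair again: silently ignored
add_sauce(1, 2)

link_count = conn.execute(
    "SELECT COUNT(*) FROM sandwiches_sauces"
).fetchone()[0]
print(link_count)
```

Because the pair can exist only once, a plain link table has no way to record “twice as much”; that information needs a column of its own.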
Note: Keep in mind that if you want to go down this road, it is easiest to do so from the start. Django’s migrations cannot automatically convert a ManyToManyField that uses the default through model into one that uses a custom through model, so unless you are prepared to write that migration by hand or to drop your database and start from scratch, you may see errors like this one:
raise ValueError(
ValueError: Cannot alter field sandwiches.Sandwich.sauces into sandwiches.Sandwich.sauces - they are not compatible types (you cannot alter to or from M2M fields, or add or remove through= on M2M fields)
Creating a custom through model:
from django.db import models


class Sauce(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name


class Sandwich(models.Model):
    name = models.CharField(max_length=100)
    sauces = models.ManyToManyField(Sauce, through='SauceQuantity')

    def __str__(self):
        return self.name


class SauceQuantity(models.Model):
    sauce = models.ForeignKey(Sauce, on_delete=models.CASCADE)
    sandwich = models.ForeignKey(Sandwich, on_delete=models.CASCADE)
    extra_sauce = models.BooleanField(default=False)

    def __str__(self):
        return f"{self.sandwich}_{self.sauce}"
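With a custom through model, older versions of Django (before 2.2) do not let you add sauces to a Sandwich with add() like you did before; newer versions allow it and accept a through_defaults argument for the extra fields. Creating entries of the SauceQuantity model explicitly works in every version, as shown below.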
>>> from sandwiches.models import Sandwich, Sauce, SauceQuantity
>>>
>>> chicken_teriyaki_sandwich = Sandwich.objects.create(name="Chicken Teriyaki with mayo and extra bbq sauce")
>>>
>>> bbq_sauce = Sauce.objects.create(name="Barbeque")
>>> mayo_sauce = Sauce.objects.create(name="Mayonnaise")
>>>
>>> SauceQuantity.objects.create(sandwich=chicken_teriyaki_sandwich, sauce=bbq_sauce, extra_sauce=True)
<SauceQuantity: Chicken Teriyaki with mayo and extra bbq sauce_Barbeque>
>>>
>>> SauceQuantity.objects.create(sandwich=chicken_teriyaki_sandwich, sauce=mayo_sauce, extra_sauce=False)
<SauceQuantity: Chicken Teriyaki with mayo and extra bbq sauce_Mayonnaise>
>>>
You can still access a sauce from a sandwich and a sandwich from a sauce just like you previously did.
>>>
>>> chicken_teriyaki_sandwich.sauces.all()
<QuerySet [<Sauce: Barbeque>, <Sauce: Mayonnaise>]>
>>>
>>> bbq_sauce.sandwich_set.all()
<QuerySet [<Sandwich: Chicken Teriyaki with mayo and extra bbq sauce>]>
>>>
>>>
To find out which sauces are being used on a sandwich and in what quantities, we can iterate through the sauces of a Sandwich and retrieve information from the SauceQuantity model for each of the sauces as shown below.
>>> for sauce in chicken_teriyaki_sandwich.sauces.all():
...     saucequantity = SauceQuantity.objects.get(sauce=sauce, sandwich=chicken_teriyaki_sandwich)
...     print("{}{}".format("Extra " if saucequantity.extra_sauce else "", sauce))
...
Extra Barbeque
Mayonnaise
>>>
The SauceQuantity model can also be extended further to include stuff like whether or not the sandwich is cut in half, type of bread used, etc.
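One thing to note about the loop above is that it runs a separate lookup for every sauce. Since the through table already holds both the link and the extra_sauce flag, the same information can come from a single join. The sketch below shows that idea at the database level with Python's built-in sqlite3 module; the table and column names are assumptions for illustration, not the ones Django generates.

```python
import sqlite3

# Hypothetical schema mirroring Sauce and SauceQuantity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sauces (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sauce_quantities (
        id INTEGER PRIMARY KEY,
        sandwich_id INTEGER NOT NULL,
        sauce_id INTEGER NOT NULL REFERENCES sauces(id),
        extra_sauce INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO sauces VALUES (1, 'Barbeque'), (2, 'Mayonnaise');
    INSERT INTO sauce_quantities (sandwich_id, sauce_id, extra_sauce)
        VALUES (1, 1, 1), (1, 2, 0);
""")

# One query: start from the through table and join the sauce names in.
rows = conn.execute("""
    SELECT sc.name, sq.extra_sauce
    FROM sauce_quantities AS sq
    JOIN sauces AS sc ON sc.id = sq.sauce_id
    WHERE sq.sandwich_id = ?
    ORDER BY sc.id
""", (1,)).fetchall()

labels = [("Extra " if extra else "") + name for name, extra in rows]
print(labels)
```

In Django terms, the rough equivalent would be to query SauceQuantity filtered by sandwich and use select_related("sauce"), so that each row arrives with its sauce already loaded.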
Closing notes
The ManyToManyField may be confusing but it is very handy. Any type of confusion you may have can be resolved with the right documentation. Here are a few that really helped me.