It sucks when you're working on a Django app and all your pages are empty. For example, if you're working on a forum webapp, then all your discussion boards will be empty by default.
Manually creating enough data for your pages to look realistic is a lot of work. Wouldn't it be nice if there was an automatic way to populate your local database with dummy data that looks real? For example, your forum app could have many threads.
Even better, wouldn't it be cool if there was an easy way to populate each thread with as many comments as you like?
In this post I'll show you how to use Factory Boy and a few other tricks to quickly and repeatably generate an endless amount of dummy data for your Django app. By the end of the post you'll be able to generate all your test data with a single management command, python manage.py setup_test_data.
There is example code for this blog post hosted in this GitHub repo.
In this post we'll be working with an example app that is an online forum. There are four models that we'll be working with:
```python
# models.py
from django.db import models


class User(models.Model):
    """A person who uses the website"""

    name = models.CharField(max_length=128)


class Thread(models.Model):
    """A forum comment thread"""

    title = models.CharField(max_length=128)
    # on_delete is required since Django 2.0; CASCADE matches the old default
    creator = models.ForeignKey(User, on_delete=models.CASCADE)


class Comment(models.Model):
    """A comment by a user on a thread"""

    body = models.CharField(max_length=128)
    poster = models.ForeignKey(User, on_delete=models.CASCADE)
    thread = models.ForeignKey(Thread, on_delete=models.CASCADE)


class Club(models.Model):
    """A group of users interested in the same thing"""

    name = models.CharField(max_length=128)
    member = models.ManyToManyField(User)
```
We'll be using Factory Boy to generate all our dummy data. It's a library that's built for automated testing, but it also works well for this use-case. Factory Boy can easily be configured to generate random but realistic data like names, emails and paragraphs by internally using the Faker library.
When using Factory Boy you create classes called "factories", which each represent a Django model. For example, for a user, you would create a factory class as follows:
```python
# factories.py
import factory
from factory.django import DjangoModelFactory

from .models import User


# Defining a factory
class UserFactory(DjangoModelFactory):
    class Meta:
        model = User

    name = factory.Faker("first_name")


# Using a factory with auto-generated data
u = UserFactory()
u.name  # Kimberly
u.id  # 51

# You can optionally pass in your own data
u = UserFactory(name="Alice")
u.name  # Alice
u.id  # 52
```
Another benefit of Factory Boy is that it can be set up to generate related data using SubFactory, saving you a lot of boilerplate and time. For example, we can set up the ThreadFactory so that it generates a User as its creator automatically:
```python
# factories.py
from .models import Thread  # in addition to the imports shown earlier


class ThreadFactory(DjangoModelFactory):
    class Meta:
        model = Thread

    # SubFactory builds a related User unless a creator is passed in
    creator = factory.SubFactory(UserFactory)
    title = factory.Faker(
        "sentence",
        nb_words=5,
        variable_nb_words=True,
    )


# Create a new thread
t = ThreadFactory()
t.title  # Room marriage study
t.creator  # <User: Michelle>
t.creator.name  # Michelle
```
The ability to automatically generate related models and fake data makes Factory Boy quite powerful. It's worth taking a quick look at the other suggested patterns if you decide to try it out.
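Once you've defined factories for all the models that you want to generate with Factory Boy, you can write a management command to automatically populate your database. Django discovers management commands in an app's management/commands/ directory, so for the forum app this file lives at forum/management/commands/setup_test_data.py. This is a pretty crude script that doesn't take advantage of all of Factory Boy's features, like sub-factories, but I didn't want to spend too much time getting fancy: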
```python
# setup_test_data.py
import random

from django.db import transaction
from django.core.management.base import BaseCommand

from forum.models import User, Thread, Club, Comment
from forum.factories import (
    UserFactory,
    ThreadFactory,
    ClubFactory,
    CommentFactory,
)

NUM_USERS = 50
NUM_CLUBS = 10
NUM_THREADS = 12
COMMENTS_PER_THREAD = 25
USERS_PER_CLUB = 8


class Command(BaseCommand):
    help = "Generates test data"

    @transaction.atomic
    def handle(self, *args, **kwargs):
        self.stdout.write("Deleting old data...")
        models = [User, Thread, Comment, Club]
        for m in models:
            m.objects.all().delete()

        self.stdout.write("Creating new data...")
        # Create all the users
        people = []
        for _ in range(NUM_USERS):
            person = UserFactory()
            people.append(person)

        # Add some users to clubs (random.choices samples with replacement)
        for _ in range(NUM_CLUBS):
            club = ClubFactory()
            members = random.choices(people, k=USERS_PER_CLUB)
            # The many-to-many field on Club is named "member"
            club.member.add(*members)

        # Create all the threads
        for _ in range(NUM_THREADS):
            creator = random.choice(people)
            thread = ThreadFactory(creator=creator)
            # Create comments for each thread
            for _ in range(COMMENTS_PER_THREAD):
                commentor = random.choice(people)
                # The foreign key on Comment is named "poster"
                CommentFactory(poster=commentor, thread=thread)
```
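A small aside on the club membership step: random.choices samples with replacement, so the same user can be drawn more than once and a club may end up with fewer distinct members than USERS_PER_CLUB. If that matters to you, random.sample is one alternative. Here is a minimal sketch using plain strings as stand-ins for User objects; the pick_members helper is hypothetical and not part of the original script.

```python
import random

NUM_USERS = 50
USERS_PER_CLUB = 8

# Plain strings stand in for the User objects created by UserFactory.
people = [f"user-{i}" for i in range(NUM_USERS)]


def pick_members(candidates, k):
    """Return k distinct members (hypothetical helper)."""
    # random.sample draws without replacement, so no user repeats.
    # It raises ValueError if k is larger than len(candidates).
    return random.sample(candidates, k)


with_replacement = random.choices(people, k=USERS_PER_CLUB)
without_replacement = pick_members(people, USERS_PER_CLUB)

print(len(set(with_replacement)))     # can be less than USERS_PER_CLUB
print(len(set(without_replacement)))  # always USERS_PER_CLUB
```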
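The transaction.atomic decorator makes a big difference to the runtime of the management command: it wraps the hundreds of queries in a single database transaction, so they are committed once at the end rather than one at a time. It also means that if anything fails partway through, the deletes and inserts are rolled back together.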
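If you want to see the effect of a single transaction for yourself, here is a rough sketch that uses the standard library's sqlite3 module directly rather than Django. It inserts the same rows twice, once committing after every insert and once inside one transaction. The table name and the timed_inserts helper are made up for illustration, and the actual speed-up will depend on your database and disk.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(db_path, num_rows, single_transaction):
    """Insert num_rows rows and return (row_count, elapsed_seconds)."""
    # isolation_level=None puts sqlite3 in autocommit mode, so every
    # statement commits on its own unless we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute("CREATE TABLE comment (body TEXT)")
    start = time.perf_counter()
    if single_transaction:
        conn.execute("BEGIN")
    for i in range(num_rows):
        conn.execute(
            "INSERT INTO comment (body) VALUES (?)", (f"comment {i}",)
        )
    if single_transaction:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start
    (count,) = conn.execute("SELECT COUNT(*) FROM comment").fetchone()
    conn.close()
    return count, elapsed


if __name__ == "__main__":
    # A file-backed database, so each commit has to reach the disk.
    with tempfile.TemporaryDirectory() as tmp:
        _, slow = timed_inserts(os.path.join(tmp, "a.db"), 300, False)
        _, fast = timed_inserts(os.path.join(tmp, "b.db"), 300, True)
    print(f"commit per insert:  {slow:.3f}s")
    print(f"single transaction: {fast:.3f}s")
```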
If you need dummy images for your website as well then there are a lot of great free tools online to help. I use adorable.io for dummy profile pics and Picsum or Unsplash for larger pictures like this one: https://picsum.photos/700/500.
Hopefully this post helps you spin up a lot of fake data for your Django app very quickly. If you enjoy using Factory Boy to generate your dummy data, then you also might like incorporating it into your unit tests.