DEV Community

Mehdi Pourfar
Mehdi Pourfar

Posted on

9 1

Faster CSV export with Django & Postgres

As a Django developer, there are many times that you are asked for CSV export of your database tables.

To do this, Python’s CSV module is a simple library to use. But when your dataset is large, it becomes so inefficient that often leads to timeout error. There are two problem with this method:

  1. Python is not very fast.
  2. Generating model objects by Django Orm is very resource consuming.

As an alternative approach, I suggest using Postgres CSV functions. This way, not only we don’t need to create thousands of model objects, but also we rapidly generate CSV thanks to Postgres speed.

But do I need to write complex SQL queries?
What should I do if I currently have a filtered queryset, for example, in Django admin?

Don’t worry. There is no need to write SQL. Here, we write a function that accepts a queryset and a filename and returns a CSV response

from django.db import connection
from django.utils import timezone
from django.http import HttpResponse

def qs_to_csv_response(qs, filename):
    sql, params = qs.query.sql_with_params()
    sql = f"COPY ({sql}) TO STDOUT WITH (FORMAT CSV, HEADER, DELIMITER E'\t')"
    filename = f'{filename}-{timezone.now():%Y-%m-%d_%H-%M-%S}.csv'
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = f'attachment; filename={filename}'
    with connection.cursor() as cur:
        sql = cur.mogrify(sql, params)
        cur.copy_expert(sql, response)
    return response

Suppose we have these models:

from django.db import models
from django.utils.translation import ugettext_lazy as _

class City(models.Model):
    name = models.CharField(max_length=50)

class Place(models.Model):
    PLACE_TYPE_CHOICES = (
        (1, _('Park')),
        (2, _('Cafe')),
        (3, _('Resturant')),
        (4, _('Cinema')),
    )
    name = models.CharField(max_length=50)
    city = models.ForeignKey(City, on_delete=models.CASCADE)
    place_type = models.PositiveSmallIntegerField(
        choices=PLACE_TYPE_CHOICES
    )

If we want to have a CSV export from Place table containing place_id, place_name, place_type and city_name, we pass this query to the function above:

from django.db.models import F

Place.objects.values(
  'id',
  'name',
  'place_type',
  city_name=F('city__name')
)

There is some caveat here. Postgres doesn’t know anything about our pretty, human readable and translated place types and putting some numbers in place_type columns can be completely useless.

The function below can be used to tell Postgres about this kind of mappings using Case and When:

from django.db.models import Case, When, Value, CharField

def map_choices(field_name, choices):
    return Case(
        *[When(**{field_name: value, 'then': Value(str(representation))})
          for value, representation in choices],
        output_field=CharField()
    )

And then, we use it in the query:

Place.objects.values(
  'id',
  'name',
  verbose_type=map_choices('place_type', Place.PLACE_TYPE_CHOICES),
  city_name=F('city__name')
)

By using Django ORM tools like F, Func, ExpressionWrapper, RawSQL … you can easily write more complex queries in a performant way.

Image of Docusign

🛠️ Bring your solution into Docusign. Reach over 1.6M customers.

Docusign is now extensible. Overcome challenges with disconnected products and inaccessible data by bringing your solutions into Docusign and publishing to 1.6M customers in the App Center.

Learn more

Top comments (1)

Collapse
 
foarsitter profile image
Jelmer

Neat solution! This made my day!

Image of Datadog

Master Mobile Monitoring for iOS Apps

Monitor your app’s health with real-time insights into crash-free rates, start times, and more. Optimize performance and prevent user churn by addressing critical issues like app hangs, and ANRs. Learn how to keep your iOS app running smoothly across all devices by downloading this eBook.

Get The eBook