Sagar Dutta

Posted on Jun 16, 2023

Django Concepts

#webdev #python #django #programming

What is secret key?

The secret key (the SECRET_KEY setting in settings.py) is a random string that Django uses for cryptographic signing. It is important to keep it secret: anyone who knows the key can forge signed values, which compromises the security of the Django application. Some of the things that depend on the secret key are:

Password reset tokens
CSRF protection
Sessions
Cookie-based messages
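
To make the idea of signing concrete, here is a minimal sketch using only the Python standard library. It is not Django's implementation; the sign and unsign helpers and the key value are hypothetical and only illustrate why a leaked key lets an attacker forge values.

```python
import hashlib
import hmac


def sign(value: str, secret_key: str) -> str:
    """Return "value:signature", where the signature is an HMAC of the value."""
    digest = hmac.new(secret_key.encode(), value.encode(), hashlib.sha256).hexdigest()
    return f"{value}:{digest}"


def unsign(signed_value: str, secret_key: str) -> str:
    """Return the original value, or raise ValueError if it was tampered with."""
    value, _, digest = signed_value.rpartition(":")
    expected = hmac.new(secret_key.encode(), value.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("signature does not match")
    return value


# Hypothetical key for illustration only; a real key must be long and random.
SECRET_KEY = "not-a-real-secret-key"
token = sign("user_id=42", SECRET_KEY)
```

Anyone without the key cannot produce a valid signature for a changed value, so unsign rejects a token whose value part was edited.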

What are the default Django apps inside it? Are there more?

For the developer’s convenience, Django ships with some apps that are installed by default, such as:

admin: A site for managing the Django project
auth: A framework for authentication and authorization
contenttypes: A framework for tracking the models installed in the project
sessions: A framework for managing session data
messages: A framework for displaying messages to users
staticfiles: A framework for managing static files

If we don't need some of them, we can comment out or delete the corresponding lines in INSTALLED_APPS before running migrate. The default apps provide useful features that we can use in our project, such as:

The admin site allows us to create, update and delete objects in our database using a user-friendly interface.
The auth system handles user authentication, permissions and groups.
The contenttypes framework tracks the models installed in the project and allows us to associate data with specific models.
The sessions framework enables us to store and retrieve data on a per-site-visitor basis.
The messages framework enables us to display one-time notifications and alerts to users.
The staticfiles framework helps us manage the static files, such as images, CSS and JavaScript.
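
For reference, this is roughly what the corresponding settings.py fragment looks like in a freshly generated project. Treat it as a sketch; the exact contents depend on the Django version and on what the project needs.

```python
# settings.py (fragment): the apps enabled by default in a new project.
INSTALLED_APPS = [
    "django.contrib.admin",         # admin site
    "django.contrib.auth",          # authentication and authorization
    "django.contrib.contenttypes",  # content type tracking for models
    "django.contrib.sessions",      # session data
    "django.contrib.messages",      # one-time messages
    "django.contrib.staticfiles",   # static file management
]
```

Our own apps are added to the same list.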

What is middleware? What are different kinds of middleware?

Middleware is a framework of hooks into Django’s request/response processing. It’s a light, low-level “plugin” system for globally altering Django’s input or output. Each middleware component is responsible for one specific function, such as security, session handling, CSRF protection or authentication. Django provides various built-in middleware and also allows us to write our own.

Middleware in Django can be written as a function or a class that takes a get_response callable and returns a middleware. A middleware is a callable that takes a request and returns a response, just like a view. A middleware can also perform some actions before or after the view is called.
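
A minimal sketch of the function-based form follows. The name timing_middleware and the X-Elapsed-Seconds header are made up for this example; the code needs no Django imports because it only relies on the response supporting item assignment for headers, which Django's HttpResponse does.

```python
import time


def timing_middleware(get_response):
    # One-time configuration and initialization goes here.

    def middleware(request):
        # Code here runs for each request BEFORE the view (and later middleware).
        start = time.perf_counter()

        response = get_response(request)

        # Code here runs for each request AFTER the view has produced a response.
        elapsed = time.perf_counter() - start
        response["X-Elapsed-Seconds"] = f"{elapsed:.6f}"
        return response

    return middleware
```

The outer function is called once when the server starts, and the inner function is called once per request.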

There are different ways to classify middleware in Django. One way is to distinguish between request middleware and response middleware.

Request middleware is middleware that is executed before the view function is called. It allows developers to modify the request object or perform some other action before the request is passed to the view function.
Response middleware is middleware that is executed after the view function is called. It allows developers to modify the response object or perform some other action before the response is returned to the client.

Another way is to distinguish between built-in middleware and custom middleware.

Built-in middleware is middleware that comes with Django by default and provides some common functionality, such as caching, authentication, csrf protection, etc.
Custom middleware is middleware that developers can create and use across their entire project or specific views. Custom middleware can be written as a function or a class that takes a get_response callable and returns a middleware.
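
Both kinds are activated in the same place, the MIDDLEWARE setting. The fragment below is a sketch based on the list that startproject generates; the last entry is a hypothetical custom middleware and its dotted path is an assumption.

```python
# settings.py (fragment): order matters, requests pass through top to bottom
# and responses pass back through bottom to top.
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    "django.middleware.csrf.CsrfViewMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    "django.contrib.messages.middleware.MessageMiddleware",
    "django.middleware.clickjacking.XFrameOptionsMiddleware",
    # Hypothetical custom middleware living in myproject/middleware.py:
    "myproject.middleware.timing_middleware",
]
```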

CSRF

Cross site request forgery (CSRF) is a type of attack that allows a malicious user to execute actions using the credentials of another user without that user's knowledge or consent. For example, an attacker can create a link or a form that submits data to a web application where the user is already logged in, and perform actions on behalf of the user.

Django has built-in protection against most types of CSRF attacks, providing we have enabled and used it where appropriate. Django's CSRF protection works by using the following components:

A CSRF cookie that is a random secret value, which other sites will not have access to. CsrfViewMiddleware sends this cookie with the response whenever django.middleware.csrf.get_token() is called. It can also send it in other cases. For security reasons, the value of the secret is changed each time a user logs in.
A hidden form field with the name 'csrfmiddlewaretoken', present in all outgoing POST forms. In order to protect against BREACH attacks, the value of this field is not simply the secret. It is scrambled differently with each response using a mask. The mask is generated randomly on every call to get_token(), so the form field value is different each time. This part is done by the csrf_token template tag.
A validation mechanism that checks for a CSRF cookie and a 'csrfmiddlewaretoken' field in all incoming requests that are not using HTTP GET, HEAD, OPTIONS or TRACE. If they are not present or correct, the user will get a 403 error. When validating the 'csrfmiddlewaretoken' field value, only the secret, not the full token, is compared with the secret in the cookie value. This allows the use of ever-changing tokens. This check is done by CsrfViewMiddleware.
An origin and referer header verification mechanism that checks for a valid origin header, if provided by the browser, against the current host and the CSRF_TRUSTED_ORIGINS setting. This provides protection against cross-subdomain attacks. In addition, for HTTPS requests, if the origin header isn't provided, CsrfViewMiddleware performs strict referer checking. This means that even if a subdomain can set or modify cookies on your domain, it can't force a user to post to your application since that request won't come from your own exact domain.
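
The masking idea, where the form token changes on every response while the underlying secret stays the same, can be sketched with the standard library. This is a simplified illustration of the principle and not Django's actual code; the function names, the alphabet and the length of 32 are assumptions chosen for the example.

```python
import secrets
import string

ALLOWED_CHARS = string.ascii_letters + string.digits
SECRET_LENGTH = 32  # assumed length for this sketch


def new_secret() -> str:
    """Generate a random string, used both as the secret and as a mask."""
    return "".join(secrets.choice(ALLOWED_CHARS) for _ in range(SECRET_LENGTH))


def mask_secret(secret: str) -> str:
    """Return mask + scrambled secret; a fresh random mask is drawn per call."""
    mask = new_secret()
    scrambled = "".join(
        ALLOWED_CHARS[(ALLOWED_CHARS.index(s) + ALLOWED_CHARS.index(m)) % len(ALLOWED_CHARS)]
        for s, m in zip(secret, mask)
    )
    return mask + scrambled


def unmask_token(token: str) -> str:
    """Recover the secret from a token produced by mask_secret()."""
    mask, scrambled = token[:SECRET_LENGTH], token[SECRET_LENGTH:]
    return "".join(
        ALLOWED_CHARS[(ALLOWED_CHARS.index(c) - ALLOWED_CHARS.index(m)) % len(ALLOWED_CHARS)]
        for c, m in zip(scrambled, mask)
    )


secret = new_secret()
token_a = mask_secret(secret)
token_b = mask_secret(secret)
```

Two tokens built from the same secret look different, yet both unmask to the same secret, which is the value that gets compared during validation.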

To use Django's CSRF protection in our views, we need to follow these steps:

Make sure that django.middleware.csrf.CsrfViewMiddleware is activated in the MIDDLEWARE setting. If we override that setting, remember that this middleware should come before any view middleware that assumes that CSRF attacks have been dealt with.
In any template that uses a POST form, use the csrf_token tag inside the <form> element if the form is for an internal URL. For example:

  <form method="post">{% csrf_token %}
  ...
  </form>

In the corresponding view functions, ensure that RequestContext is used to render the response so that {% csrf_token %} will work properly. If you're using the render() function, generic views, or contrib apps, you are covered already since these all use RequestContext.
If you are using AJAX requests, you need to set a custom X-CSRFToken header (as specified by the CSRF_HEADER_NAME setting) to the value of the CSRF token on each XMLHttpRequest. You can get the token from the csrftoken cookie or from a hidden input field rendered by {% csrf_token %}. You can also use JavaScript frameworks that provide hooks for setting headers on every request.

XSS

Cross site scripting (XSS) is a type of vulnerability that involves manipulating user interaction with a web application to compromise a user's browser environment. These vulnerabilities can affect many web apps, including those built with modern frameworks. XSS attacks happen through injections of scripts that contain HTML tags. For example, an attacker can inject a script that displays an alert or steals a user's cookies.

Django has built-in protection against most types of XSS attacks, providing we have enabled and used it where appropriate. Django templates escape specific characters which are particularly dangerous to HTML, such as <, >, ', ", and &. This prevents the browser from interpreting them as HTML tags and executing them as scripts. However, this protection is not entirely foolproof, and there are some cases where we need to be extra careful, such as:

When using template literals in JavaScript, which can bypass the escaping mechanism.
When using attribute URLs, such as href or src, which can contain malicious code.
When using JavaScript or CSS data, which can contain expressions that execute scripts.
When using the safe template filter, the mark_safe function, or the is_safe attribute on custom template filters, which disable the escaping mechanism.
When using the safeseq filter, which disables the escaping for a sequence of variables.
When using html_safe() method, which marks a string as safe for HTML output.
When outputting something other than HTML, such as JSON or XML, which may require different escaping rules.
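
The effect of escaping those five characters can be seen with the standard library's html.escape, used here as a stand-in for what template auto-escaping does to a variable. The payload string is a made-up example.

```python
from html import escape

# A typical injection attempt submitted as "user input".
payload = "<script>alert('xss')</script>"

# quote=True also escapes single and double quotes, not just <, > and &.
escaped = escape(payload, quote=True)
print(escaped)  # &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;
```

The browser displays the escaped text literally instead of running it as a script.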

To prevent XSS attacks in Django, we should follow these best practices:

Quote dynamic data that comes from user input or untrusted sources.
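Avoid interpolating template variables directly into JavaScript code, including inside JavaScript template literals, because HTML escaping does not make a value safe in a script context.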
Validate attribute URLs and ensure they start with http:// or https://.
Pass data to JavaScript using the json_script template filter.
Django has no built-in filter for escaping CSS, so avoid placing untrusted data inside style blocks or style attributes.
Use safe, safeseq, mark_safe, is_safe, and html_safe() only when you are sure the data is safe and does not contain any HTML tags.
Use autoescape template tag or autoescape context variable to enable or disable escaping for a block of code.
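
The URL check from the list above could look like the following sketch. The helper name is_safe_link is hypothetical; it uses urllib.parse from the standard library and deliberately rejects everything that is not an absolute http or https URL, including relative paths.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_safe_link(url: str) -> bool:
    """Return True only for absolute URLs with an http or https scheme."""
    return urlparse(url).scheme in ALLOWED_SCHEMES


print(is_safe_link("https://example.com/page"))  # True
print(is_safe_link("javascript:alert(1)"))       # False
```

A value that fails the check should not be written into an href or src attribute.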

Clickjacking

Clickjacking is a type of attack that tricks a user into clicking on a concealed element of another site which they have loaded in a hidden frame or iframe. For example, an attacker can create a button that overlays a legitimate site's button and perform actions on behalf of the user.

Django provides clickjacking protection in the form of X-Frame-Options middleware. This feature prevents a different site from rendering our site inside a frame. The middleware sets the X-Frame-Options HTTP header to DENY by default, which means that our site cannot be displayed in a frame on any site. We can change the header value with the X_FRAME_OPTIONS setting, for example to SAMEORIGIN, which means that our site can only be displayed in a frame by pages from the same origin (the same scheme, host and port).

To use Django's clickjacking protection for all responses, we need to activate django.middleware.clickjacking.XFrameOptionsMiddleware in the MIDDLEWARE setting. This middleware is enabled in the settings file generated by startproject.
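
As a sketch, the relevant settings.py fragment might look like this. The MIDDLEWARE list is shortened to the one relevant entry, and the choice of SAMEORIGIN is only an example.

```python
# settings.py (fragment)
MIDDLEWARE = [
    # ... other middleware omitted for brevity ...
    "django.middleware.clickjacking.XFrameOptionsMiddleware",
]

# Header value sent by the middleware: "DENY" or "SAMEORIGIN".
X_FRAME_OPTIONS = "SAMEORIGIN"
```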

To disable the clickjacking protection for certain views, we can use the xframe_options_exempt view decorator. For example:

from django.http import HttpResponse
from django.views.decorators.clickjacking import xframe_options_exempt

@xframe_options_exempt
def ok_to_load_in_a_frame(request):
    return HttpResponse("This page is safe to load in a frame on any site.")

We can also use other view decorators to set different values for the X-Frame-Options header, such as xframe_options_sameorigin or xframe_options_deny.

What is WSGI?

Web Server Gateway Interface (WSGI) is a simple calling convention for web servers to forward requests to web applications or frameworks written in the Python programming language. It is a standard interface that makes it possible for applications written with any Python web framework to run in any web server that supports WSGI. It was created in 2003 to promote common ground for portable web application development.
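
The calling convention itself is small enough to show in full. Below is a minimal WSGI application written without any framework; the greeting text is arbitrary.

```python
def application(environ, start_response):
    """A WSGI application: a callable taking the request environ and a callback."""
    body = b"Hello, WSGI!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers through the server-provided callback.
    start_response("200 OK", headers)
    # The response body is returned as an iterable of bytes.
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, application).serve_forever() can serve this callable. A Django project exposes the same kind of callable, also named application, in its wsgi.py file.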

What is ON DELETE CASCADE?

ON DELETE CASCADE is an option when defining a foreign key in SQL. It indicates that related rows should be deleted in the child table when rows are deleted in the parent table. It is a kind of referential action that prevents leaving orphan records, that is, children without parents. For example, if we have two tables, buildings and rooms, where each building has one or many rooms and each room belongs to one and only one building, we can use ON DELETE CASCADE on the foreign key of the rooms table that references the buildings table. This way, when we delete a building, all rooms in that building will also be deleted automatically.
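
The buildings and rooms example can be tried out with the standard library's sqlite3 module. The table layout and sample rows are assumptions made for this sketch. In Django, the counterpart is the on_delete=models.CASCADE argument of a ForeignKey.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE buildings (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE rooms (
        id INTEGER PRIMARY KEY,
        number TEXT NOT NULL,
        building_id INTEGER NOT NULL
            REFERENCES buildings (id) ON DELETE CASCADE
    )
    """
)

conn.execute("INSERT INTO buildings (id, name) VALUES (1, 'Main Hall')")
conn.executemany(
    "INSERT INTO rooms (number, building_id) VALUES (?, ?)",
    [("101", 1), ("102", 1)],
)

# Deleting the parent row removes its child rows as well.
conn.execute("DELETE FROM buildings WHERE id = 1")
remaining_rooms = conn.execute("SELECT COUNT(*) FROM rooms").fetchone()[0]
print(remaining_rooms)  # 0
```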

Model Field

A model field is a class attribute of a model that represents a column in a database table. It defines the data type of the column through its field class, such as CharField, IntegerField or BooleanField, and the constraints on the data that can be stored. A model field accepts various options that affect how the data is stored and displayed, such as null, blank, default and choices. For example, the null option indicates whether the column can hold NULL values, and the choices option provides a list of tuples to use as choices for the field. A model field can also have one or more validators that check the validity of the input data.

A model field is an instance of a Field class that subclasses django.db.models.Field. Django provides many built-in field types, such as AutoField, BigAutoField, BigIntegerField, BinaryField, etc. We can also create our own custom fields by subclassing Field and implementing the required methods.

A model field is used to map a Python object attribute to a database column. A model field also provides methods and properties to access and manipulate the data associated with the field. For example, a model field has a name property that returns the name of the field, a value_from_object method that returns the value of the field from an object instance, and a get_db_prep_value method that prepares the value for database insertion.

Validators

A validator is a callable object or function that takes a value and raises a ValidationError if it doesn't meet some criteria. Validators can be used in Django at the model level or at the form level. They are useful for reusing validation logic between different types of fields.

For example, we can write a validator that only allows even numbers:

from django.core.exceptions import ValidationError
from django.utils.translation import gettext_lazy as _

def validate_even(value):
    if value % 2 != 0:
        raise ValidationError(
            _("%(value)s is not an even number"),
            params={"value": value},
        )

We can add this validator to a model field or a form field using the validators argument:

from django.db import models

class MyModel(models.Model):
    even_field = models.IntegerField(validators=[validate_even])

from django import forms

class MyForm(forms.Form):
    even_field = forms.IntegerField(validators=[validate_even])

We can also use a class with a __call__() method for more complex or configurable validators. For example, RegexValidator is a class-based validator that searches the value for a given regular expression.

Django also provides some built-in validators in the django.core.validators module, such as EmailValidator, MinValueValidator, URLValidator, etc.

What is the difference between a Python module and a Python class?

A Python module is a file containing Python definitions and statements that can be imported into another Python program. A module can contain classes, functions, variables, and other code elements. Modules help to organize code into logical units and to reuse it in different programs.

A Python class is a blueprint for creating objects that have attributes and behaviors. A class can inherit from other classes, use metaclasses and descriptors, and create instances. A class is a way to abstract the data and methods that belong to a specific type of objects.

Some differences are:

A module cannot be instantiated, but a class can.
A module is a singleton instance of an internal module class, and its globals are its attributes. A class can have multiple instances, each with its own state and identity.
A module can contain multiple classes, but a class cannot contain multiple modules.
A module is identified by its file name, but a class is identified by its class name.
A module can be executed as a script, but a class cannot be executed directly.
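
A few of these differences can be checked directly in Python. The Counter class is a made-up example; math is used only because it is a module that is always available.

```python
import math
import math as math_again  # importing again yields the same module object
import types


class Counter:
    """A tiny class whose instances each keep their own state."""

    def __init__(self):
        self.count = 0


a, b = Counter(), Counter()
a.count += 1  # only changes the state of instance a

is_module = isinstance(math, types.ModuleType)
same_module = math_again is math
separate_state = (a.count, b.count)

try:
    math()  # a module cannot be called or instantiated
except TypeError:
    module_is_callable = False
else:
    module_is_callable = True

print(is_module, same_module, separate_state, module_is_callable)
# True True (1, 0) False
```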

Using ORM queries in Django Shell

ORM queries in Django Shell are a way of interacting with data from various relational databases using an API called ORM (Object Relational Mapping). ORM allows us to write object-oriented code that is translated into raw SQL queries behind the scenes.

To use ORM queries in Django Shell, we need to follow these steps:

Import the models we want to query from our app. For example, if we have a model called Student in an app called sampleapp, we can import it as:
```
from sampleapp.models import Student
```
Use the model's objects attribute to access the ORM methods, such as all(), create(), get(), filter(), order_by(), etc. For example, if we want to get all the students from the database, we can use:
```
queryset = Student.objects.all()
```
We can print the queryset object to see the results, or use the str() function to see the corresponding SQL query. For example:
```
print(queryset)
str(queryset.query)
```
We can also use dot notation to access the fields of each object in the queryset. For example, if we want to print the name and email of each student, we can use:
```
for student in queryset:
    print(student.name, student.email)
```
We can modify or delete objects using the save() or delete() methods on them. For example, if we want to change the name of a student with id 1, we can use:
```
student = Student.objects.get(id=1)
student.name = "New name"
student.save()
```
We can also perform complex queries using Q objects, F expressions, annotations, aggregations, etc. For example, if we want to get the students whose name starts with 'A' or whose email contains 'gmail', we can use:
```
from django.db.models import Q
queryset = Student.objects.filter(Q(name__startswith='A') | Q(email__contains='gmail'))
```

Turning ORM to SQL in Django Shell

Turning ORM to SQL in Django Shell is a way of seeing the raw SQL query that Django generates based on the ORM query. This can be useful for debugging, optimizing, or learning purposes.

To turn ORM to SQL in Django Shell, we can use the query attribute of a QuerySet object. Converting it to a string gives the SQL query that Django builds behind the scenes.

For example, if we have a model called Student and we want to see the SQL query for getting all the students from the database, we can do the following:

Import the model from the app. For example:
```
from sampleapp.models import Student
```
Use the model's objects attribute to access the ORM methods, such as all(), and assign it to a variable. For example:
```
queryset = Student.objects.all()
```

Use the print() function or the str() function to see the raw SQL query. For example:

print(queryset.query)
str(queryset.query)

This will print something like:

SELECT "sampleapp_student"."id", "sampleapp_student"."name", "sampleapp_student"."email" FROM "sampleapp_student"

We can use this technique for any ORM query, such as filter(), order_by(), annotate(), etc.

What are Aggregations?

Aggregations are a way of summarizing or computing values from a collection of objects in Django ORM. For example, we can use aggregations to find the average price of all books, the maximum rating of all books, the total number of books per publisher, etc.

To perform aggregations, we can use two methods: aggregate() and annotate(). The aggregate() method returns a dictionary of name-value pairs, where the name is an identifier for the aggregate value and the value is the computed aggregate. For example:

# Average price across all books.
>>> from django.db.models import Avg
>>> Book.objects.aggregate(Avg("price"))
{'price__avg': 34.35}

We can use various aggregate functions, such as Count, Sum, Avg, Min, Max, etc. We can also use field lookups to filter or group by related fields using the double underscore notation. For example:

# Each publisher, with a separate count of books with a rating above and below 5
>>> from django.db.models import Count, Q
>>> above_5 = Count("book", filter=Q(book__rating__gt=5))
>>> below_5 = Count("book", filter=Q(book__rating__lte=5))
>>> pubs = Publisher.objects.annotate(below_5=below_5).annotate(above_5=above_5)
>>> pubs[0].above_5
23
>>> pubs[0].below_5
12
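
To see how the two methods differ at the SQL level, here is a rough sketch using the standard library's sqlite3 module rather than the ORM. The table names, columns and sample rows are assumptions; the queries only approximate what aggregate() and annotate() produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        price REAL,
        publisher_id INTEGER REFERENCES publisher (id)
    );
    INSERT INTO publisher VALUES (1, 'BaloneyPress'), (2, 'SalamiPress');
    INSERT INTO book VALUES (1, 'A', 10.0, 1), (2, 'B', 20.0, 1), (3, 'C', 30.0, 2);
    """
)

# Like aggregate(): one summary value for the whole table.
avg_price = conn.execute("SELECT AVG(price) FROM book").fetchone()[0]

# Like annotate(): one row per publisher, each with its own computed value.
per_publisher = conn.execute(
    """
    SELECT publisher.name, COUNT(book.id) AS num_books
    FROM publisher
    LEFT OUTER JOIN book ON book.publisher_id = publisher.id
    GROUP BY publisher.id
    ORDER BY publisher.id
    """
).fetchall()

print(avg_price)      # 20.0
print(per_publisher)  # [('BaloneyPress', 2), ('SalamiPress', 1)]
```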

What are Annotations?

Annotations are a way of adding extra attributes to each object in a QuerySet that are derived from some computation or aggregation. For example, we can use annotations to add a count of books for each publisher, a total price of all books in a store, a flag indicating if a book has a rating above a certain threshold, etc.

To perform annotations, we can use the annotate() method on a QuerySet and pass an expression that defines the annotation. The expression can be a simple value, a field reference, an aggregate function, a function, a lookup, or any combination of these using arithmetic or boolean operators. For example:

# Each publisher, each with a count of books as a "num_books" attribute.
>>> from django.db.models import Count
>>> pubs = Publisher.objects.annotate(num_books=Count("book"))
>>> pubs
<QuerySet [<Publisher: BaloneyPress>, <Publisher: SalamiPress>, ...]>
>>> pubs[0].num_books
73

# Each book, with an extra attribute indicating if it has more than 300 pages.
>>> from django.db.models import BooleanField, Case, When
>>> books = Book.objects.annotate(
...     is_long=Case(
...         When(pages__gt=300, then=True),
...         default=False,
...         output_field=BooleanField(),
...     )
... )
>>> books
<QuerySet [<Book: War and Peace>, <Book: The Catcher in the Rye>, ...]>
>>> books[0].is_long
True

We can use annotations in filters, order by clauses, or other annotations. We can also chain multiple annotations together to add more than one extra attribute. For example:

# Each book, with an extra attribute indicating its price after applying a 10% discount,
# and another attribute indicating if it is cheap (less than $20).
>>> from decimal import Decimal
>>> from django.db.models import DecimalField, ExpressionWrapper, F
>>> books = Book.objects.annotate(
...     discounted_price=ExpressionWrapper(
...         F("price") * Decimal("0.9"),
...         output_field=DecimalField(max_digits=10, decimal_places=2),
...     )
... ).annotate(
...     is_cheap=Case(
...         When(discounted_price__lt=20, then=True),
...         default=False,
...         output_field=BooleanField(),
...     )
... )
>>> books
<QuerySet [<Book: War and Peace>, <Book: The Catcher in the Rye>, ...]>
>>> books[0].discounted_price
Decimal('18.00')
>>> books[0].is_cheap
True

What is a migration file? Why is it needed?

A migration file is a Python file that describes the changes to the database schema that are required by the models. For example, if we add a new field to a model, Django will create a migration file that instructs the database to add a new column to the corresponding table.

Migration files are needed because they allow us to keep track of the evolution of the database schema over time, and to apply the same changes consistently across different environments (such as development, staging, and production). They also enable us to roll back to a previous state of the database if something goes wrong.

Migration files are usually generated automatically by Django when we run the makemigrations command, which detects any changes we have made to the models since the last migration. We can also create empty migration files manually by using the --empty option with the makemigrations command, and then write our own custom code inside them.

Migration files are stored in a migrations folder inside each app that has models. The migration files have names that start with a number (such as 0001_initial.py) that indicate the order in which they should be applied. We can apply the migrations to the database by running the migrate command, which will execute the SQL statements generated by Django based on the migration files.

What are SQL transactions? (non ORM concept)

A SQL transaction is a grouping of one or more SQL statements that interact with a database. A transaction in its entirety can commit to a database as a single logical unit or rollback (become undone) as a single logical unit. In SQL, transactions are essential for maintaining database integrity. They are used to preserve integrity when multiple related operations are executed concurrently, or when multiple users interact with a database concurrently.

A database application has to account for every possible failure scenario while writing and reading from a database. Without SQL transactions, application code that protects database integrity would be complex and expensive to develop and maintain. With SQL transactions, application code and database maintenance can be simplified.

What are atomic transactions?

Atomic transactions are database operations in which a set of actions must all complete, or else none of them do. For example, if someone is booking a flight, we want to both take payment and reserve the seat, or do neither. If either one were allowed to succeed without the other also succeeding, the database would be inconsistent.
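
The flight booking example can be sketched with the standard library's sqlite3 module. The table layout and the book_seat helper are hypothetical. In Django, transaction.atomic from django.db plays the same role as the with block used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (booking_id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
conn.execute("CREATE TABLE seats (seat TEXT PRIMARY KEY, booking_id INTEGER)")
conn.execute("INSERT INTO seats (seat, booking_id) VALUES ('12A', NULL)")
conn.commit()


def book_seat(conn, booking_id, seat, amount):
    """Take payment and reserve the seat as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO payments (booking_id, amount) VALUES (?, ?)",
                (booking_id, amount),
            )
            updated = conn.execute(
                "UPDATE seats SET booking_id = ? WHERE seat = ? AND booking_id IS NULL",
                (booking_id, seat),
            ).rowcount
            if updated != 1:
                raise RuntimeError(f"seat {seat} is not available")
        return True
    except RuntimeError:
        return False


first = book_seat(conn, 1, "12A", 300)   # seat is free: both steps commit
second = book_seat(conn, 2, "12A", 300)  # seat is taken: the payment is rolled back
payment_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
print(first, second, payment_count)  # True False 1
```

The second booking fails at the seat reservation step, so its payment row is undone as well and the database stays consistent.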