<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aaron Harris</title>
    <description>The latest articles on DEV Community by Aaron Harris (@alphaharris).</description>
    <link>https://dev.to/alphaharris</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F158020%2Fa1ed08a7-c6e4-4920-8a27-70e9a12a6c73.jpg</url>
      <title>DEV Community: Aaron Harris</title>
      <link>https://dev.to/alphaharris</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alphaharris"/>
    <language>en</language>
    <item>
      <title>Django Database Migrations: A Comprehensive Overview</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Wed, 23 Oct 2019 19:30:55 +0000</pubDate>
      <link>https://dev.to/kite/django-database-migrations-a-comprehensive-overview-5dk8</link>
      <guid>https://dev.to/kite/django-database-migrations-a-comprehensive-overview-5dk8</guid>
      <description>&lt;p&gt;Django Database Migrations: A Comprehensive Overview&lt;br&gt;
&lt;em&gt;by Damian Hites&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The Django web framework is designed to work with an SQL-based relational database backend, most commonly PostgreSQL or MySQL. If you’ve never worked directly with a relational database before, managing how your data is stored and accessed, and keeping it consistent with your application code, is an important skill to master.&lt;/p&gt;

&lt;p&gt;You’ll need a contract between your database schema (how your data is laid out in your database) and your application code, so that when your application tries to access data, the data is where your application expects it to be. Django provides an abstraction for managing this contract in its ORM (Object-Relational Mapping). &lt;/p&gt;

&lt;p&gt;Over your application’s lifetime, it’s very likely that your data needs will change. When this happens, your database schema will probably need to change as well. Effectively, your contract (in Django’s case, your Models) will need to change to reflect the new agreement, and before you can run the application, the database will need to be migrated to the new schema. &lt;/p&gt;

&lt;p&gt;Django’s ORM comes with a system for managing these migrations to simplify the process of keeping your application code and your database schema in sync.&lt;/p&gt;
&lt;h2&gt;
  
  
  Django’s database migration solution
&lt;/h2&gt;

&lt;p&gt;Django’s migration tool simplifies the manual nature of the migration process described above while taking care of tracking your migrations and the state of your database. Let’s take a look at the three-step migration process with Django’s migration tool.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Change the contract: Django’s ORM
&lt;/h3&gt;

&lt;p&gt;In Django, the contract between your database schema and your application code is defined using the Django ORM. You define a data model using Django ORM’s models and your application code interfaces with that data model. &lt;/p&gt;

&lt;p&gt;When you need to add data to the database or change the way the data is structured, you create a new model or modify an existing one. Then you can make the required changes to your application code and update your unit tests, which should verify your new contract (given enough test coverage).&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Plan for change: generate migrations
&lt;/h3&gt;

&lt;p&gt;Django maintains the contract largely through its migration tool. Once you make changes to your models, Django’s &lt;code&gt;makemigrations&lt;/code&gt; command detects those changes and generates migration files for you.&lt;/p&gt;
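&lt;p&gt;To make the idea of change detection concrete, here is a toy sketch in plain Python. It is not Django’s actual detection logic, which is far more thorough. The &lt;code&gt;detect_changes&lt;/code&gt; helper, the dictionary-based model descriptions, and the model names are all assumptions made for illustration. It compares an old and a new description of the models and lists the operations a migration would need:&lt;/p&gt;

```python
# Toy illustration only: Django's real change detection is far more thorough.
# Each state maps a (hypothetical) model name to the set of field names it defines.


def detect_changes(old_state, new_state):
    """Return the operations needed to move from old_state to new_state."""
    operations = []
    for model, fields in sorted(new_state.items()):
        if model not in old_state:
            # A model that did not exist before must be created.
            operations.append(("CreateModel", model))
            continue
        # For existing models, report every field that is new.
        for field in sorted(fields - old_state[model]):
            operations.append(("AddField", model, field))
    return operations


old = {"Post": {"slug", "title", "body"}}
new = {
    "Post": {"slug", "title", "body", "published_on"},
    "FeedbackOption": {"slug", "option"},
}

for operation in detect_changes(old, new):
    print(operation)
```

&lt;p&gt;The sketch only handles new models and new fields. Removals, renames, and altered field options are left out to keep it short.&lt;/p&gt;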
&lt;h3&gt;
  
  
  3. Execute: apply migrations
&lt;/h3&gt;

&lt;p&gt;Finally, Django’s &lt;code&gt;migrate&lt;/code&gt; command applies any unapplied migrations to the database. Run this command any time you deploy your code to the production environment. Ideally, your deploy scripts will run it right before pushing your new code live.&lt;/p&gt;
&lt;h3&gt;
  
  
  Tracking changes with Django
&lt;/h3&gt;

&lt;p&gt;Django takes care of tracking migrations for you. Each generated migration file has a unique name that serves as an identifier. Django records every applied migration in a database table, which ensures that only unapplied migrations are run.&lt;/p&gt;

&lt;p&gt;The migration files that Django generates should be included in the same commit as their corresponding application code so that the code is never out of sync with your database schema.&lt;/p&gt;
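&lt;p&gt;The tracking idea can be sketched with nothing but Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module. This is a simplified stand-in, not how Django implements it: the &lt;code&gt;applied_migrations&lt;/code&gt; table, the migration names, and the SQL statements are hypothetical. Each migration is recorded once it has run, so a second run skips it:&lt;/p&gt;

```python
import sqlite3

# Hypothetical migration names and SQL, used only for this sketch.
MIGRATIONS = [
    ("0001_initial", "CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT)"),
    ("0002_add_published_on", "ALTER TABLE post ADD COLUMN published_on TEXT"),
]


def apply_unapplied(conn):
    """Run every migration not yet recorded in the tracking table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    newly_applied = []
    for name, sql in MIGRATIONS:
        if name in applied:
            continue  # already recorded, so skip it
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        newly_applied.append(name)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
print(apply_unapplied(conn))  # first run applies both migrations
print(apply_unapplied(conn))  # second run finds nothing left to apply
```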
&lt;h3&gt;
  
  
  Rolling back with Django
&lt;/h3&gt;

&lt;p&gt;Django can roll back to a previous migration. The auto-generated operations have built-in support for being reversed. For a custom operation, it’s on you to make sure the operation can be reversed so that this functionality is always available.&lt;/p&gt;
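&lt;p&gt;What “reversible” means can be shown with a small toy model in plain Python. The &lt;code&gt;AddColumn&lt;/code&gt; class and the &lt;code&gt;forwards&lt;/code&gt;/&lt;code&gt;backwards&lt;/code&gt; method names are assumptions for this sketch, not Django’s migration API. Every operation knows how to undo itself, and a rollback undoes operations in reverse order:&lt;/p&gt;

```python
# Toy model of reversible operations; the class and method names here are
# assumptions for this sketch, not Django's migration API.


class AddColumn:
    """An operation that knows how to apply itself and how to undo itself."""

    def __init__(self, table, column):
        self.table = table
        self.column = column

    def forwards(self, schema):
        schema[self.table].append(self.column)

    def backwards(self, schema):
        schema[self.table].remove(self.column)


def migrate(schema, operations):
    for operation in operations:
        operation.forwards(schema)


def rollback(schema, operations):
    # Undo in reverse order so later operations are reversed first.
    for operation in reversed(operations):
        operation.backwards(schema)


schema = {"post": ["id", "slug", "title", "body"]}
operations = [AddColumn("post", "published_on")]

migrate(schema, operations)
print(schema["post"])
rollback(schema, operations)
print(schema["post"])
```

&lt;p&gt;A custom operation without a working reverse step would break this chain, which is why reversibility is worth checking whenever you write one by hand.&lt;/p&gt;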
&lt;h2&gt;
  
  
  A simple Django database migrations example
&lt;/h2&gt;

&lt;p&gt;Now that we have a basic understanding of how migrations are handled in Django, let’s look at a simple example of migrating an application from one state to the next. Let’s assume we have a Django project for our blog and we want to make some changes. &lt;/p&gt;

&lt;p&gt;First, we want to allow for our posts to be edited before publishing to the blog. Second, we want to allow people to give feedback on each post, but we want to give them a curated list of options for that feedback. In anticipation of those options changing, we want to define them in our database rather than in the application code.&lt;/p&gt;
&lt;h3&gt;
  
  
  The initial Django application
&lt;/h3&gt;

&lt;p&gt;For the purposes of demonstration, we’ll set up a very basic Django project called &lt;code&gt;foo&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;django-admin startproject foo
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Within that project, we’ll set up our blogging application. From inside the project’s base directory: &lt;code&gt;./manage.py startapp blog&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Register our new application with our project in &lt;code&gt;foo/settings.py&lt;/code&gt; by adding &lt;code&gt;blog&lt;/code&gt; to &lt;code&gt;INSTALLED_APPS&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;INSTALLED_APPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="s"&gt;'blog'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;In &lt;code&gt;blog/models.py&lt;/code&gt; we can define our initial data model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;slug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SlugField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;In our simple application, the only model we have represents a blog post. It has a slug for uniquely identifying the post, a title, and the body of the post.&lt;/p&gt;

&lt;p&gt;Now that we have our initial data model defined, we can generate the migrations that will set up our database: &lt;code&gt;./manage.py makemigrations&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Notice that the output of this command indicates that a new migration file was created at &lt;code&gt;blog/migrations/0001_initial.py&lt;/code&gt;, containing a &lt;code&gt;CreateModel&lt;/code&gt; operation with &lt;code&gt;name='Post'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If we open the migration file, it will look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Generated by Django 2.2 on 2019-04-21 18:04
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;initial&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="n"&gt;dependencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;operations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Post'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AutoField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;auto_created&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                    &lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                    &lt;span class="n"&gt;serialize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                    &lt;span class="n"&gt;verbose_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'ID'&lt;/span&gt;
                &lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'slug'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SlugField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'body'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextField&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Most of the migration’s contents are pretty easy to make sense of. This initial migration was auto-generated, has no dependencies, and has a single operation: create the &lt;code&gt;Post&lt;/code&gt; model.&lt;/p&gt;

&lt;p&gt;Now let’s set up an initial SQLite database with our data model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./manage.py migrate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The default Django configuration uses SQLite3, so the above command generates a file called &lt;code&gt;db.sqlite3&lt;/code&gt; in your project’s root directory. Using the SQLite3 command line interface, you can inspect the contents of the database and of certain tables. &lt;/p&gt;

&lt;p&gt;To enter the SQLite3 command line tool run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;sqlite3&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqlite3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Once in the tool, list all tables generated by your initial migration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;sqlite&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Django comes with a number of initial models that will result in database tables, but the two that we care about right now are &lt;code&gt;blog_post&lt;/code&gt;, the table corresponding to our &lt;code&gt;Post&lt;/code&gt; model, and &lt;code&gt;django_migrations&lt;/code&gt;, the table Django uses to track migrations.&lt;/p&gt;

&lt;p&gt;Still in the SQLite3 command line tool, you can print the contents of the &lt;code&gt;django_migrations&lt;/code&gt; table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;sqlite&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;django_migrations&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This will show all migrations that have run for your application. If you look through the list, you’ll find a record indicating that the &lt;code&gt;0001_initial&lt;/code&gt; migration was run for the &lt;code&gt;blog&lt;/code&gt; application. This is how Django knows that your migration has been applied.&lt;/p&gt;
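&lt;p&gt;If you’d rather check this from Python than from the SQLite3 prompt, a sketch along these lines should work. The &lt;code&gt;applied_migrations&lt;/code&gt; helper is hypothetical, and it assumes the tracking table has &lt;code&gt;app&lt;/code&gt; and &lt;code&gt;name&lt;/code&gt; columns, so confirm the column names against your own database first. To keep the example runnable on its own, it builds a small in-memory stand-in for the table with a made-up row instead of opening &lt;code&gt;db.sqlite3&lt;/code&gt;:&lt;/p&gt;

```python
import sqlite3


def applied_migrations(conn, app_label):
    """Return the names of migrations recorded for one app, in applied order."""
    rows = conn.execute(
        "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
        (app_label,),
    )
    return [row[0] for row in rows]


# Stand-in for db.sqlite3 so the sketch runs on its own; against a real
# project you would pass sqlite3.connect("db.sqlite3") instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_migrations "
    "(id INTEGER PRIMARY KEY, app TEXT, name TEXT, applied TEXT)"
)
# Made-up row for illustration; the timestamp is not real data.
conn.execute(
    "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)",
    ("blog", "0001_initial", "2019-04-21 18:10:00"),
)
print(applied_migrations(conn, "blog"))
```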

&lt;h3&gt;
  
  
  Changing the Django data model
&lt;/h3&gt;

&lt;p&gt;Now that the initial application is set up, let’s make changes to the data model. First, we’ll add a field called &lt;code&gt;published_on&lt;/code&gt; to our &lt;code&gt;Post&lt;/code&gt; model. This field will be nullable. When we want to publish something, we can simply indicate when it was published.&lt;/p&gt;

&lt;p&gt;Our new &lt;code&gt;Post&lt;/code&gt; model will now be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;slug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SlugField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;published_on&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateTimeField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;null&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blank&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Next, we want to add support for accepting feedback on our posts. We want two models here: one for tracking the options we display to people, and one for tracking the actual responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.conf&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FeedbackOption&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;slug&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SlugField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;option&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PostFeedback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AUTH_USER_MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'feedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;'Post'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'feedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;option&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s"&gt;'FeedbackOption'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'feedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Generate the Django database migration
&lt;/h3&gt;

&lt;p&gt;With our model changes done, let’s generate our new migrations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./manage.py makemigrations
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Notice that this time, the output indicates a new migration file, &lt;code&gt;blog/migrations/0002_auto_&amp;lt;YYYYMMDD&amp;gt;_&amp;lt;...&amp;gt;.py&lt;/code&gt;, with the following changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create model &lt;code&gt;FeedbackOption&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Add field &lt;code&gt;published_on&lt;/code&gt; to &lt;code&gt;Post&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Create model &lt;code&gt;PostFeedback&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are the three changes that we introduced to our data model. &lt;/p&gt;

&lt;p&gt;Now, if we go ahead and open the generated file, it will look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Generated by Django 2.2 on 2019-04-21 19:31
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.conf&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;django.db.models.deletion&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="n"&gt;dependencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;swappable_dependency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AUTH_USER_MODEL&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'blog'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'0001_initial'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;operations&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'FeedbackOption'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AutoField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;auto_created&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;serialize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'ID'&lt;/span&gt;
                &lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'slug'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SlugField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'option'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'post'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'published_on'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateTimeField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blank&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;null&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;migrations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'PostFeedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AutoField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;auto_created&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;serialize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;verbose_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'ID'&lt;/span&gt;
                &lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'option'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;django&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deletion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'feedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'blog.FeedbackOption'&lt;/span&gt;
                &lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'post'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;django&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deletion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'feedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'blog.Post'&lt;/span&gt;
                &lt;span class="p"&gt;)),&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;django&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deletion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'feedback'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AUTH_USER_MODEL&lt;/span&gt;
                &lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Similar to our first migration file, each operation maps to changes that we made to the data model. The main differences to note are the dependencies. Django has detected that our change relies on the first migration in the blog application and, since we depend on the auth user model, that is marked as a dependency as well.&lt;/p&gt;

&lt;h3&gt;
  
  
  Applying the Django database migration
&lt;/h3&gt;

&lt;p&gt;Now that we have our migrations generated, we can apply the migrations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./manage.py migrate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The output tells us that the latest generated migration has been applied. If we inspect our modified SQLite database, we’ll see that our new migration is recorded in the &lt;code&gt;django_migrations&lt;/code&gt; table, the new tables are present, and our new field on the &lt;code&gt;Post&lt;/code&gt; model is reflected in the &lt;code&gt;blog_post&lt;/code&gt; table.&lt;/p&gt;
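
&lt;p&gt;As a rough illustration of that inspection, the sketch below reads the &lt;code&gt;django_migrations&lt;/code&gt; table with Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;applied_migrations&lt;/code&gt; helper is a hypothetical name introduced here, and the &lt;code&gt;db.sqlite3&lt;/code&gt; path in the comment assumes Django’s default SQLite file name.&lt;/p&gt;

```python
import sqlite3


def applied_migrations(conn, app_label):
    """Return the names of migrations recorded as applied for one app."""
    rows = conn.execute(
        "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
        (app_label,),
    )
    return [name for (name,) in rows]


# Example usage (assumes the default SQLite database next to manage.py):
#     conn = sqlite3.connect("db.sqlite3")
#     print(applied_migrations(conn, "blog"))
```

&lt;p&gt;If the migration was applied, its file name (without the &lt;code&gt;.py&lt;/code&gt; extension) should appear in the returned list for the &lt;code&gt;blog&lt;/code&gt; app.&lt;/p&gt;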

&lt;p&gt;Now, if we were to deploy our changes to production, the application code and database would be updated, and we would be running the new version of our application.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bonus: data migrations
&lt;/h3&gt;

&lt;p&gt;In this particular example, the &lt;code&gt;blog_feedbackoption&lt;/code&gt; table (generated by our migration) will be empty when we push our code change. If our interface has been updated to surface these options, there is a chance that we forget to populate them when we push. Even if we don’t forget, we have the same problem as before: the new objects are created in the database while the new application code is deploying, so there is a brief window in which the interface shows a blank list of options.&lt;/p&gt;

&lt;p&gt;To help in scenarios where the required data is somewhat tied to the application code or to changes in the data model, Django provides utilities for making data migrations. These are migration operations that change only the data in the database rather than the table structure.&lt;/p&gt;

&lt;p&gt;Let’s say we want to have the following feedback options: Interesting, Mildly Interesting, Not Interesting and Boring. We could put our data migration in the same migration file that we generated previously, but let’s create another migration file specifically for this data migration...&lt;/p&gt;

&lt;p&gt;... &lt;em&gt;check out &lt;a href="https://kite.com/blog/python/django-database-migrations-overview/"&gt;the code on Kite's blog&lt;/a&gt;! Continue with "Bonus: Data Migrations"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Damian Hites is the CTO of Sylo, which is looking to improve Social Media Marketing by offering 3rd party trusted measurement. He has 10+ years of experience writing software and leading teams.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>django</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Django Templates: Best Practices</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Tue, 15 Oct 2019 19:10:50 +0000</pubDate>
      <link>https://dev.to/kite/django-templates-best-practices-1e25</link>
      <guid>https://dev.to/kite/django-templates-best-practices-1e25</guid>
      <description>&lt;h2&gt;
  
  
  Introduction to Django templates
&lt;/h2&gt;

&lt;p&gt;Django, as a web framework, uses templates as a way of producing static HTML from the output of a Django view. In practice, Django’s templates are simply HTML files, with some special syntax and a set of tools which lets Django render the HTML page on-the-fly for the visiting user. Templates are highly customizable, but are meant to be simple, with most of the “heavy” logic going into the view. Let’s dive deeper and learn some standard ways of dealing with common problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Simple start with Django templates
&lt;/h2&gt;

&lt;p&gt;By default, Django comes with a ton of built-in template &lt;em&gt;tags&lt;/em&gt; and &lt;em&gt;filters&lt;/em&gt; that help us perform repeatable template tasks throughout our apps. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; Tags provide arbitrary logic in the rendering process. Django leaves this definition fairly vague, but tags are able to output content, grab content from the database (more on this later), or perform control operations like if statements or for loops.&lt;/p&gt;

&lt;p&gt;Examples of tags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;firstof&lt;/span&gt; &lt;span class="nv"&gt;user.is_active&lt;/span&gt; &lt;span class="nv"&gt;user.is_staff&lt;/span&gt; &lt;span class="nv"&gt;user.is_deleted&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://kite.com/python/docs/django.template.defaulttags.firstof"&gt;&lt;code&gt;firstof&lt;/code&gt;&lt;/a&gt; tag will output the first provided variable which evaluates to &lt;code&gt;True&lt;/code&gt;. This is a good replacement for a large &lt;code&gt;if/elif/elif/elif/elif&lt;/code&gt; block that’s just evaluating on truthiness within your Django templates.&lt;br&gt;
&lt;/p&gt;
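
&lt;p&gt;In plain Python terms, the behaviour is roughly the following. This is a sketch of the semantics only, not Django’s implementation, and &lt;code&gt;first_of&lt;/code&gt; is a hypothetical name used for illustration.&lt;/p&gt;

```python
def first_of(*values, default=""):
    """Return the first truthy value, or `default` when none is truthy."""
    return next((value for value in values if value), default)


# The first value is falsy, so the second one is the first truthy value.
result = first_of(False, "staff", None)
```

&lt;p&gt;When every value is falsy, the sketch returns an empty string, mirroring the tag rendering nothing.&lt;/p&gt;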

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;ul&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nv"&gt;product&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nv"&gt;product_list&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;li&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;product.name&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;: $&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;product.price&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endfor&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;for&lt;/code&gt; tag in Django will loop over each item in a list, making that item (&lt;em&gt;product&lt;/em&gt;, in this case) available in the template context until the tag is closed with &lt;code&gt;endfor&lt;/code&gt;. This is a widely used pattern when working with lists of Django model instances which have been returned from the view.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Filters:&lt;/strong&gt; Filters transform the values of variables and arguments. Filters are used for tasks like rendering a string in uppercase or formatting a date for a user’s locale.&lt;/p&gt;

&lt;p&gt;Examples of filters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nf"&gt;date&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="s1"&gt;'D d M Y'&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://kite.com/python/docs/django.template.defaultfilters.date"&gt;&lt;strong&gt;date&lt;/strong&gt;&lt;/a&gt; filter will format a date (&lt;code&gt;value&lt;/code&gt;, in the example) given a string with some format characters. The example would output the string: &lt;code&gt;Mon 01 Apr 2019&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;
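
&lt;p&gt;For comparison, here is a rough Python equivalent using &lt;code&gt;strftime&lt;/code&gt;. Django’s format characters (&lt;code&gt;D d M Y&lt;/code&gt;) differ from Python’s, so the mapping below is an approximation that assumes the default English day and month names.&lt;/p&gt;

```python
from datetime import date

value = date(2019, 4, 1)

# Django's 'D d M Y' corresponds roughly to '%a %d %b %Y' in strftime:
# abbreviated weekday, zero-padded day, abbreviated month, four-digit year.
formatted = value.strftime("%a %d %b %Y")
print(formatted)
```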

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nf"&gt;slugify&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;a href="https://kite.com/python/docs/django.template.defaultfilters.slugify"&gt;&lt;strong&gt;slugify&lt;/strong&gt;&lt;/a&gt; filter will convert the spaces of a string into hyphens and convert the string to lowercase, among other things. The output of this example &lt;code&gt;would-look-something-like-this&lt;/code&gt;.&lt;/p&gt;
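
&lt;p&gt;To make the “among other things” concrete, the sketch below approximates the transformation with the standard library. It is not Django’s own code; &lt;code&gt;slugify_sketch&lt;/code&gt; is a hypothetical name, and it handles only the ASCII case.&lt;/p&gt;

```python
import re
import unicodedata


def slugify_sketch(value):
    """Approximate a slugify filter for ASCII output."""
    # Drop accents and any characters that cannot be represented in ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove everything except word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


slug = slugify_sketch("Would Look Something Like This!")
```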

&lt;h2&gt;
  
  
  Project structure
&lt;/h2&gt;

&lt;p&gt;Django, by default, will make some assumptions about the structure of our project when it’s looking for templates. Knowing this, we can set up our project with a &lt;em&gt;template directory&lt;/em&gt; and &lt;em&gt;application template directories&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Imagine a project, &lt;em&gt;cloud&lt;/em&gt;, with the following structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;cloud/
    accounts/
        urls.py
        models.py
        views.py
        templates/
            accounts/
                login.html
                register.html
    blog/
        urls.py
        views.py
        models.py
        templates/
            blog/
                create.html
                post.html
                list.html
    config/
        settings/
            base.py
            local.py
        urls.py
    manage.py
    templates/
        includes/
            messages.html
            modal.html
        base.html
        logged_in.html
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
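
&lt;p&gt;A settings sketch that would let Django find both kinds of template directory in this layout might look like the following. &lt;code&gt;DIRS&lt;/code&gt; and &lt;code&gt;APP_DIRS&lt;/code&gt; are standard keys of the &lt;code&gt;TEMPLATES&lt;/code&gt; setting; the &lt;code&gt;BASE_DIR&lt;/code&gt; value and the choice of context processors are assumptions made for illustration.&lt;/p&gt;

```python
from pathlib import Path

# Assumed project root: the directory that contains manage.py.
BASE_DIR = Path("cloud")

TEMPLATES = [
    {
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        # Project-level template directory: cloud/templates/
        # (base.html, logged_in.html, includes/).
        "DIRS": [BASE_DIR / "templates"],
        # Also search the templates/ directory of each installed app,
        # e.g. cloud/blog/templates/blog/post.html.
        "APP_DIRS": True,
        "OPTIONS": {
            "context_processors": [
                "django.template.context_processors.debug",
                "django.template.context_processors.request",
                "django.contrib.auth.context_processors.auth",
                "django.contrib.messages.context_processors.messages",
            ],
        },
    },
]
```

&lt;p&gt;Namespacing app templates in a subdirectory named after the app (&lt;code&gt;blog/post.html&lt;/code&gt; rather than &lt;code&gt;post.html&lt;/code&gt;) keeps two apps from shadowing each other’s templates.&lt;/p&gt;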



&lt;h2&gt;
  
  
  How inheritance works for Django templates
&lt;/h2&gt;

&lt;p&gt;An important aspect of Django’s templating system is &lt;em&gt;template inheritance&lt;/em&gt;. Django applications are meant to be reusable, and we can apply the same methodology to our templates by inheriting common HTML from other templates.&lt;/p&gt;

&lt;p&gt;A typical pattern is to have a common base template for common aspects of your application, logged-in pages, logged-out pages, or in places where significant changes are made to the underlying HTML. From our example above, &lt;code&gt;base.html&lt;/code&gt; would contain most of the core structure that would make up each page, with &lt;em&gt;blocks&lt;/em&gt; defined for app or page-specific customizations.&lt;/p&gt;

&lt;p&gt;For example, &lt;code&gt;base.html&lt;/code&gt; may contain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;load&lt;/span&gt; &lt;span class="nv"&gt;static&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;!doctype html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&lt;/span&gt; &lt;span class="na"&gt;lang=&lt;/span&gt;&lt;span class="s"&gt;"en"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;charset=&lt;/span&gt;&lt;span class="s"&gt;"utf-8"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"viewport"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"width=device-width, initial-scale=1, shrink-to-fit=no"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;page_meta&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

  &lt;span class="c"&gt;{# Vendor styles #}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;vendor_css&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"stylesheet"&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text/css"&lt;/span&gt; &lt;span class="na"&gt;media=&lt;/span&gt;&lt;span class="s"&gt;"all"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;static&lt;/span&gt; &lt;span class="s1"&gt;'css/vendor.css'&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

  &lt;span class="c"&gt;{# Global styles #}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;site_css&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;"stylesheet"&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text/css"&lt;/span&gt; &lt;span class="na"&gt;media=&lt;/span&gt;&lt;span class="s"&gt;"all"&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;static&lt;/span&gt; &lt;span class="s1"&gt;'css/application.css'&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

  &lt;span class="c"&gt;{# Page-specific styles #}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;autoescape&lt;/span&gt; &lt;span class="nv"&gt;off&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;page_css&lt;/span&gt; &lt;span class="cp"&gt;%}{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endautoescape&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;extra_head&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="c"&gt;{# Extra header stuff (scripts, styles, metadata, etc) #}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;page_title&lt;/span&gt; &lt;span class="cp"&gt;%}{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;body_class&lt;/span&gt; &lt;span class="cp"&gt;%}{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;body&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="c"&gt;{# Page content will go here #}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

&lt;span class="c"&gt;{# Modal HTML #}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;modals&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

&lt;span class="c"&gt;{# Vendor javascript #}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;vendor_js&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;static&lt;/span&gt; &lt;span class="s1"&gt;'js/vendor.js'&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

&lt;span class="c"&gt;{# Global javascript #}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;site_js&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;static&lt;/span&gt; &lt;span class="s1"&gt;'js/application.js'&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

&lt;span class="c"&gt;{# Shared data for javascript #}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text/javascript"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_sharedData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;autoescape&lt;/span&gt; &lt;span class="nv"&gt;off&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
      &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;shared_data&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DEBUG&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nv"&gt;debug&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endif&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endautoescape&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;

&lt;span class="c"&gt;{# Page javascript #}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;autoescape&lt;/span&gt; &lt;span class="nv"&gt;off&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;page_js&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endautoescape&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;There are a few things done in this example specifically for the sake of inheritance. Most notably, this base template has blocks defined for nearly every customizable aspect of the underlying HTML. Blocks for including CSS, JavaScript, an HTML title, meta tags, and more are all defined.&lt;/p&gt;

&lt;p&gt;We wrap blocks in Django’s &lt;code&gt;autoescape&lt;/code&gt; template tag (set to &lt;code&gt;off&lt;/code&gt;) where we don’t want Django to autoescape our HTML tags or JavaScript, but rather treat the contents of the block literally.&lt;/p&gt;

&lt;p&gt;Our &lt;code&gt;shared_data&lt;/code&gt; block allows us to populate a global JavaScript object with variables and data which we may want to share between Django and any running JavaScript on the page (populating React or Vue.js components, for example).&lt;/p&gt;

&lt;p&gt;For example, if we wanted to pass a Django URL to one of our JavaScript files, we could do something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="s1"&gt;'base.html'&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;

&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;shared_data&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;block.super&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;
  'USERS_AUTOCOMPLETE_ENDPOINT': '&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="nv"&gt;url&lt;/span&gt; &lt;span class="s1"&gt;'api:users:autocomplete'&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;',
&lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endblock&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Django renders the page and returns a JavaScript object that you can then use within the JavaScript files on the page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text/javascript"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_sharedData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;      
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DEBUG&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;USERS_AUTOCOMPLETE_ENDPOINT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/users/autocomplete/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The inside of a JS console once the page has loaded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; window._sharedData.DEBUG
&lt;span class="nb"&gt;false&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; window._sharedData.USERS_AUTOCOMPLETE_ENDPOINT
&lt;span class="s1"&gt;'/api/users/autocomplete/'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  Handling querysets
&lt;/h2&gt;

&lt;p&gt;Improperly handling querysets within your templates can create a performance bottleneck for Django, depending on the complexity of your model definitions.&lt;/p&gt;

&lt;p&gt;Django’s templating system is tightly coupled with Django’s object-relational mapping layer, which returns data from the database. Without proper consideration of this coupling you may inadvertently cause the number of queries run on each page load to jump to unsustainable levels. In some cases, this can cause the database to become too sluggish to serve certain pages on your site, or worse, crash and need to be restarted.&lt;/p&gt;

&lt;p&gt;Thankfully, Django provides mechanisms and patterns which we can use to make sure our templates are running as fast as possible and we’re not killing the database server. &lt;/p&gt;

&lt;p&gt;Consider this common Django pattern:&lt;/p&gt;

&lt;p&gt;accounts/views.py&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserListView&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ListView&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;template_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'accounts/list.html'&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;User&lt;/span&gt;
    &lt;span class="n"&gt;paginate_by&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;
    &lt;span class="n"&gt;context_object_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'users'&lt;/span&gt;
    &lt;span class="n"&gt;queryset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;accounts/templates/accounts/list.html&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;...
&lt;span class="nt"&gt;&amp;lt;table&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;thead&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;tr&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;th&amp;gt;&lt;/span&gt;Username&lt;span class="nt"&gt;&amp;lt;/th&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;th&amp;gt;&lt;/span&gt;Email&lt;span class="nt"&gt;&amp;lt;/th&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;th&amp;gt;&lt;/span&gt;Profile photo URL&lt;span class="nt"&gt;&amp;lt;/th&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;th&amp;gt;&lt;/span&gt;Joined&lt;span class="nt"&gt;&amp;lt;/th&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/thead&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;tbody&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nv"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nv"&gt;users&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;tr&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;user.username&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;user.email_address&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;user.profile.avatar_url&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;user.created_at&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;{%&lt;/span&gt; &lt;span class="k"&gt;endfor&lt;/span&gt; &lt;span class="cp"&gt;%}&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/tbody&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/table&amp;gt;&lt;/span&gt;
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Can you spot the problem? It may not be obvious at first, but look at this line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight jinja"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;user.profile.avatar_url&lt;/span&gt; &lt;span class="cp"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;When Django processes and renders our template (line by line), it needs to run an additional query to grab information from the &lt;em&gt;profile&lt;/em&gt; object, as it’s a related field. In our example view, we’re paginating by 25 users, so this one line in the template could account for an additional 25 queries on each page request, because the profile object, like all related objects in Django, isn’t included in the original query for the 25 users. You can imagine how this could become a very slow page if we were including fields from other related objects in our table, or if we were paginating by 100 users instead of 25.&lt;/p&gt;

&lt;p&gt;To resolve this, we’ll change one line in our view, &lt;code&gt;accounts/views.py&lt;/code&gt;, to &lt;a href="https://kite.com/python/docs/django.db.models.query.QuerySet.select_related"&gt;&lt;strong&gt;select related&lt;/strong&gt;&lt;/a&gt; objects when we’re running our original query for users:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserListView&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ListView&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;template_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'accounts/list.html'&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;User&lt;/span&gt;
    &lt;span class="n"&gt;paginate_by&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;
    &lt;span class="n"&gt;context_object_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'users'&lt;/span&gt;
    &lt;span class="n"&gt;queryset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;select_related&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'profile'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;By replacing our &lt;code&gt;User.objects.all()&lt;/code&gt; with &lt;code&gt;User.objects.select_related('profile')&lt;/code&gt;, we’re telling Django to include related profile instances when it’s performing its query for our users. This attaches the related &lt;code&gt;Profile&lt;/code&gt; instance to each &lt;code&gt;User&lt;/code&gt; instance, preventing Django from needing to run an extra query each time we ask for information from the profile within the template.&lt;/p&gt;

&lt;p&gt;Django’s &lt;code&gt;select_related&lt;/code&gt; functionality only follows single-valued relationships (foreign keys and one-to-one fields). It does not work with many-to-many relationships, or with one-to-many (reverse foreign key) relationships. For those, we’d want to use Django’s &lt;a href="https://docs.djangoproject.com/en/2.1/ref/models/querysets/#prefetch-related"&gt;&lt;code&gt;prefetch_related&lt;/code&gt;&lt;/a&gt; method.&lt;/p&gt;

&lt;p&gt;Unlike &lt;code&gt;select_related&lt;/code&gt;, &lt;code&gt;prefetch_related&lt;/code&gt; does its magic in Python rather than in a SQL join: it runs a separate query for each relationship and then attaches the related objects to the instances, where they can be accessed in templates as we’ve done above. It doesn’t do everything in a single query the way &lt;code&gt;select_related&lt;/code&gt; can, but it’s much more efficient than running a query each time you request a related attribute.&lt;/p&gt;
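
&lt;p&gt;To make the difference in query counts concrete, here is a minimal sketch that uses Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module instead of Django. The table names, the &lt;code&gt;CountingConnection&lt;/code&gt; helper, and the three fetch functions are all hypothetical; they only imitate the per-row lookup, the single join that &lt;code&gt;select_related&lt;/code&gt; aims for, and the two-query approach that &lt;code&gt;prefetch_related&lt;/code&gt; takes.&lt;/p&gt;

```python
import sqlite3


class CountingConnection:
    """Hypothetical helper: wraps a connection and counts SELECTs."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def select(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def build_db(size=25):
    """Create an in-memory database with one profile per user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE profile (
            id INTEGER PRIMARY KEY,
            user_id INTEGER UNIQUE REFERENCES user(id),
            avatar_url TEXT
        );
        """
    )
    for i in range(1, size + 1):
        conn.execute("INSERT INTO user VALUES (?, ?)", (i, "user{}".format(i)))
        conn.execute(
            "INSERT INTO profile VALUES (?, ?, ?)",
            (i, i, "https://example.com/avatars/{}.png".format(i)),
        )
    return CountingConnection(conn)


def fetch_naive(db):
    """One query for the users, then one more query per user."""
    rows = []
    for user_id, username in db.select("SELECT id, username FROM user ORDER BY id"):
        avatar = db.select(
            "SELECT avatar_url FROM profile WHERE user_id = ?", (user_id,)
        )[0][0]
        rows.append((username, avatar))
    return rows


def fetch_joined(db):
    """A single JOIN, in the spirit of select_related."""
    return db.select(
        "SELECT user.username, profile.avatar_url FROM user "
        "JOIN profile ON profile.user_id = user.id ORDER BY user.id"
    )


def fetch_prefetched(db):
    """Two queries joined in Python, in the spirit of prefetch_related."""
    users = db.select("SELECT id, username FROM user ORDER BY id")
    ids = [user_id for user_id, _ in users]
    placeholders = ", ".join("?" for _ in ids)
    profiles = db.select(
        "SELECT user_id, avatar_url FROM profile "
        "WHERE user_id IN ({})".format(placeholders),
        ids,
    )
    avatar_by_user = dict(profiles)
    return [(username, avatar_by_user[user_id]) for user_id, username in users]
```

&lt;p&gt;With 25 users, the naive version issues 26 queries, the joined version issues one, and the prefetched version issues two, while all three return the same rows.&lt;/p&gt;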

&lt;p&gt;A prefetch for related &lt;strong&gt;projects&lt;/strong&gt; and &lt;strong&gt;organizations&lt;/strong&gt; and one-to-many relationships off of the &lt;code&gt;User&lt;/code&gt; model would look like this:&lt;/p&gt;

&lt;p&gt;... check out the &lt;a href="https://kite.com/blog/python/django-templates-best-practices/"&gt;full article and source code&lt;/a&gt; from kite.com!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Zac Clancy is Vice President at Global DIRT (Disaster Immediate Response Team)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>django</category>
      <category>python</category>
      <category>webdev</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Using Custom Authentication Backends in Django</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Tue, 08 Oct 2019 18:02:24 +0000</pubDate>
      <link>https://dev.to/kite/using-custom-authentication-backends-in-django-32pn</link>
      <guid>https://dev.to/kite/using-custom-authentication-backends-in-django-32pn</guid>
      <description>&lt;p&gt;Many organizations use widely-adopted authentication systems provided by services like Google, Facebook, or GitHub. A few Python packages provide authentication integration with these services, but most of them expect you to manage the final user accounts with Django. What happens when you need to work with user accounts that live in another system altogether?&lt;/p&gt;

&lt;p&gt;In this article, you’ll see the interface that Django exposes for authenticating to an external system. By the end, you should understand the pieces involved in mapping an external system’s information to Django’s native &lt;code&gt;User&lt;/code&gt; objects in order to work with them on your own site.&lt;/p&gt;

&lt;h2&gt;
  
  
  Django’s default authentication
&lt;/h2&gt;

&lt;p&gt;In the &lt;a href="https://kite.com/blog/python/django-authentication/"&gt;Django User Authentication System&lt;/a&gt;, we covered the basics of how default authentication works in Django. Ultimately, you can interact with &lt;code&gt;User&lt;/code&gt; objects and understand if a user &lt;code&gt;is_authenticated&lt;/code&gt; or not. Using the default authentication system, you can make use of many of Django’s built-in features like its login and logout views and password reset workflow.&lt;/p&gt;

&lt;p&gt;When working with an external authentication system, you have to manage these pieces yourself. Some of them may not make sense to you depending on how your authentication system works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Authentication backends
&lt;/h2&gt;

&lt;p&gt;As with many of Django’s systems, authentication is modeled as a plugin system. Django will try to authenticate users through a series of authentication backends. The default backend checks a user’s username and password against all the existing &lt;code&gt;User&lt;/code&gt; objects in the database to authenticate them. The &lt;code&gt;AUTHENTICATION_BACKENDS&lt;/code&gt; setting is your entrypoint to intercept this workflow and point Django to your external system.&lt;/p&gt;

&lt;p&gt;An authentication backend is a class that, minimally, implements two methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;get_user(user_id)&lt;/code&gt; — a &lt;code&gt;user_id&lt;/code&gt; can be whatever unique identifier your external system uses to distinguish users, and &lt;code&gt;get_user&lt;/code&gt; returns either a user object matching the given &lt;code&gt;user_id&lt;/code&gt; or &lt;code&gt;None&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;authenticate(request, **credentials)&lt;/code&gt; — the &lt;code&gt;request&lt;/code&gt; is the current HTTP request, and the credentials keyword arguments are whatever credentials your external system needs to check if a user should be authenticated or not. This is often a username and password, but it could be an API token or some other scheme. &lt;code&gt;authenticate&lt;/code&gt; returns an authenticated &lt;code&gt;User&lt;/code&gt; object or &lt;code&gt;None&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Inside your authentication backend’s authenticate method, you can pass along the credentials to your external system via a REST API or another common authentication scheme like LDAP or SAML.&lt;/p&gt;
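
&lt;p&gt;The two-method interface described above can be sketched in plain Python. Everything named here is an assumption made for illustration: &lt;code&gt;FAKE_DIRECTORY&lt;/code&gt; stands in for the external system, &lt;code&gt;ExternalUser&lt;/code&gt; stands in for the user object, and &lt;code&gt;DirectoryBackend&lt;/code&gt; is a hypothetical backend name. In a real project, &lt;code&gt;authenticate&lt;/code&gt; would call the external service and return a Django &lt;code&gt;User&lt;/code&gt; (or a compatible object).&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical stand-in for an external user service. A real service would
# verify credentials itself rather than expose plain-text passwords.
FAKE_DIRECTORY = {
    "ada": {"id": 1, "password": "correct horse"},
}


@dataclass
class ExternalUser:
    """Hypothetical stand-in for the user object a backend returns."""

    id: int
    username: str


class DirectoryBackend:
    """Sketch of a backend exposing the two methods Django looks for."""

    def authenticate(self, request, username=None, password=None):
        # Return a user when the credentials match, otherwise None.
        record = FAKE_DIRECTORY.get(username)
        if record is None or record["password"] != password:
            return None
        return ExternalUser(id=record["id"], username=username)

    def get_user(self, user_id):
        # Look the user up again by the identifier stored in the session.
        for username, record in FAKE_DIRECTORY.items():
            if record["id"] == user_id:
                return ExternalUser(id=user_id, username=username)
        return None
```

&lt;p&gt;A backend like this would be enabled by adding its dotted import path to the &lt;code&gt;AUTHENTICATION_BACKENDS&lt;/code&gt; setting.&lt;/p&gt;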

&lt;p&gt;Using the wonderful &lt;a href="https://yesno.wtf/"&gt;Yes or No?&lt;/a&gt; API, you could build an authentication backend that authenticates a user occasionally if the API permits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FickleAuthBackend&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;'https://yesno.wtf/api/'&lt;/span&gt;
        &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'answer'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;'yes'&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;While &lt;code&gt;authenticate&lt;/code&gt; can return a user object or &lt;code&gt;None&lt;/code&gt;, it may also return an &lt;code&gt;AnonymousUser&lt;/code&gt; object, or raise &lt;code&gt;PermissionDenied&lt;/code&gt; to explicitly halt any further authentication checks. This allows for a variety of ways to proceed, and anonymous users may still have certain permissions. You’ll want to account for that in your middleware and views.&lt;/p&gt;
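
&lt;p&gt;To illustrate how &lt;code&gt;None&lt;/code&gt; and &lt;code&gt;PermissionDenied&lt;/code&gt; affect the chain of backends, here is a simplified sketch of the loop. It is not Django’s actual implementation: the local &lt;code&gt;PermissionDenied&lt;/code&gt; class stands in for &lt;code&gt;django.core.exceptions.PermissionDenied&lt;/code&gt;, and &lt;code&gt;authenticate_with&lt;/code&gt; and the three toy backends are hypothetical names.&lt;/p&gt;

```python
class PermissionDenied(Exception):
    """Local stand-in for django.core.exceptions.PermissionDenied."""


def authenticate_with(backends, request=None, **credentials):
    """Try each backend in order, roughly as Django does."""
    for backend in backends:
        try:
            user = backend.authenticate(request, **credentials)
        except PermissionDenied:
            # A backend vetoed the attempt: stop checking further backends.
            return None
        if user is not None:
            return user
        # None means "not my user": fall through to the next backend.
    return None


class AlwaysNone:
    def authenticate(self, request, **credentials):
        return None


class Blocker:
    def authenticate(self, request, **credentials):
        raise PermissionDenied()


class Accepts:
    def authenticate(self, request, username=None, **credentials):
        # Returns a plain string instead of a real user object.
        return "user:{}".format(username)
```

&lt;p&gt;With &lt;code&gt;[AlwaysNone(), Accepts()]&lt;/code&gt; the second backend gets its turn, whereas with &lt;code&gt;[Blocker(), Accepts()]&lt;/code&gt; it is never consulted.&lt;/p&gt;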

&lt;p&gt;If the external user service provides additional information about the user, &lt;code&gt;get_user&lt;/code&gt; might be a good place to grab some of that data. You can add attributes to the user object in &lt;code&gt;authenticate&lt;/code&gt; before you return it if you’d like, but be careful of how many attributes you add dynamically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Permissions
&lt;/h2&gt;

&lt;p&gt;I also covered Django’s permission scheme in The Django User Authentication System: when given a user, you can inquire about their permissions generally or against specific objects using the &lt;code&gt;has_perm&lt;/code&gt; method. Custom authentication backends can override permission checking methods and Django will check against those first before falling back to its default checks. This allows you to make queries to your external system about permissions in addition to authentication:&lt;/p&gt;

&lt;p&gt;... &lt;a href="https://kite.com/blog/python/custom-django-authentication/"&gt;continue with Permissions&lt;/a&gt;, see the code, and find more Django tutorials by Dane!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Dane Hillard has an upcoming book "Practices of the Python Pro" coming this month (October 2019)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>django</category>
      <category>python</category>
      <category>tutorial</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Advanced Django Models: Improve Your Python Development</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Fri, 04 Oct 2019 17:25:01 +0000</pubDate>
      <link>https://dev.to/kite/advanced-django-models-improve-your-python-development-2i6k</link>
      <guid>https://dev.to/kite/advanced-django-models-improve-your-python-development-2i6k</guid>
      <description>&lt;p&gt;Models are a core concept of the Django framework. According to Django’s &lt;a href="https://docs.djangoproject.com/en/2.2/misc/design-philosophies/#models"&gt;design philosophies for models&lt;/a&gt;, we should be as explicit as possible with the naming and functionality of our fields, and ensure that we’re including all relevant functionality related to our model in the model itself, rather than in the views or somewhere else. If you’ve worked with Ruby on Rails before, these design philosophies won’t seem new as both Rails and Django implement the &lt;a href="https://www.martinfowler.com/eaaCatalog/activeRecord.html"&gt;Active Record pattern&lt;/a&gt; for their object-relational mapping (ORM) systems to handle stored data. &lt;/p&gt;

&lt;p&gt;In this post we’ll look at some ways to leverage these philosophies, core Django features, and even some libraries to help make our models better.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/zH1OL-va9wI"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;getter/setter/deleter&lt;/code&gt; properties
&lt;/h2&gt;

&lt;p&gt;Properties have been a feature of Python since version 2.2: a property is used like an attribute but is actually backed by a method. While using a property on a model isn’t that advanced, we can use some underutilized features of the Python property to make our models more powerful. &lt;/p&gt;

&lt;p&gt;If you’re using Django’s built-in authentication or have customized your authentication using &lt;code&gt;AbstractBaseUser&lt;/code&gt;, you’re probably familiar with the &lt;code&gt;last_login&lt;/code&gt; field defined on the &lt;code&gt;User&lt;/code&gt; model, which is a saved timestamp of the user’s last login to your application. If we want to use &lt;code&gt;last_login&lt;/code&gt;, but also have a field named &lt;code&gt;last_seen&lt;/code&gt; saved to a cache more frequently, we could do so pretty easily.&lt;/p&gt;

&lt;p&gt;First, we’ll make a Python &lt;em&gt;property&lt;/em&gt; that looks for a value in the cache and, if it can’t find one, returns the value from the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;accounts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.contrib.auth.base_user&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AbstractBaseUser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.core.cache&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AbstractBaseUser&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;

    &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;property&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="s"&gt;"""
        Returns the 'last_seen' value from the cache for a User.
        """&lt;/span&gt;
        &lt;span class="n"&gt;last_seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'last_seen_{0}'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pk&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="c1"&gt;# Check cache result, otherwise return the database value
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;last_seen&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_login&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Note: I’ve slimmed the model down a bit as there’s a separate tutorial on this blog about specifically &lt;a href="https://kite.com/blog/python/custom-django-user-model/"&gt;customizing the built-in Django user model&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The property above checks our cache for the user’s &lt;code&gt;last_seen&lt;/code&gt; value, and if it doesn’t find anything, it will return the user’s stored &lt;code&gt;last_login&lt;/code&gt; value from the model. Referencing &lt;code&gt;&amp;lt;instance&amp;gt;.last_seen&lt;/code&gt; now provides a much more customizable attribute on our model behind a very simple interface.&lt;/p&gt;

&lt;p&gt;We can expand this to include custom behavior when a value is assigned to our property (&lt;code&gt;some_user.last_seen = some_date_time&lt;/code&gt;), or when a value is deleted from the property (&lt;code&gt;del some_user.last_seen&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setter&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""
    Sets the 'last_seen_[uuid]' value in the cache for a User.
    """&lt;/span&gt;
    &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;

    &lt;span class="c1"&gt;# Save in the cache
&lt;/span&gt;    &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'last_seen_{0}'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pk&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deleter&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;last_seen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""
    Removes the 'last_seen' value from the cache.
    """&lt;/span&gt;
    &lt;span class="c1"&gt;# Delete the cache key
&lt;/span&gt;    &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'last_seen_{0}'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pk&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now, whenever a value is assigned to our &lt;code&gt;last_seen&lt;/code&gt; property, we save it to the cache, and when a value is removed with &lt;code&gt;del&lt;/code&gt;, we remove it from the cache. Using &lt;code&gt;setter&lt;/code&gt; and &lt;code&gt;deleter&lt;/code&gt; is described in the Python documentation but is rarely seen in the wild when looking at Django models. &lt;/p&gt;
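
&lt;p&gt;Here is a minimal, Django-free sketch of the same getter/setter/deleter pattern that can be run on its own. It assumes a plain dictionary named &lt;code&gt;cache&lt;/code&gt; as a stand-in for Django’s cache and a simplified &lt;code&gt;User&lt;/code&gt; class that is not a real model.&lt;/p&gt;

```python
from datetime import datetime

# Stand-in for django.core.cache.cache: a plain dictionary.
cache = {}


class User:
    """Simplified, hypothetical user: not a real Django model."""

    def __init__(self, pk, last_login):
        self.pk = pk
        self.last_login = last_login

    @property
    def last_seen(self):
        # Prefer the cached value, fall back to the stored last_login.
        return cache.get("last_seen_{0}".format(self.pk)) or self.last_login

    @last_seen.setter
    def last_seen(self, value):
        # Assignment writes to the cache only, never to last_login.
        cache["last_seen_{0}".format(self.pk)] = value

    @last_seen.deleter
    def last_seen(self):
        # "del user.last_seen" clears the cached value if there is one.
        cache.pop("last_seen_{0}".format(self.pk), None)


user = User(pk=7, last_login=datetime(2019, 10, 1))
user.last_seen = datetime(2019, 10, 4, 12, 0)  # stored in the cache
del user.last_seen                             # back to last_login
```

&lt;p&gt;After the &lt;code&gt;del&lt;/code&gt;, reading &lt;code&gt;user.last_seen&lt;/code&gt; falls back to &lt;code&gt;last_login&lt;/code&gt; again, and &lt;code&gt;last_login&lt;/code&gt; itself is never modified.&lt;/p&gt;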

&lt;p&gt;You may have a use case like this one, where you want to store something that doesn’t necessarily need to be persisted to a traditional database, or for performance reasons, shouldn’t be. Using a custom property like the above example is a great solution.&lt;/p&gt;

&lt;p&gt;In a similar use case, the &lt;code&gt;python-social-auth&lt;/code&gt; library, a tool for managing user authentication using third-party platforms like GitHub and Twitter, will create and update information in your database based on information from the platform the user logged in with. In some cases, the information returned won’t match the fields in our database. For example, the &lt;code&gt;python-social-auth&lt;/code&gt; library will pass a &lt;code&gt;fullname&lt;/code&gt; keyword argument when creating the user. If we had used &lt;code&gt;full_name&lt;/code&gt; as the attribute name in our database, we might be in a pinch.&lt;/p&gt;

&lt;p&gt;A simple way around this is by using the &lt;code&gt;getter/setter&lt;/code&gt; pattern from above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;property&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fullname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;

&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;fullname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setter&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fullname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now, when &lt;code&gt;python-social-auth&lt;/code&gt; saves a user’s &lt;code&gt;fullname&lt;/code&gt; to our model (&lt;code&gt;new_user.fullname = 'Some User'&lt;/code&gt;), we’ll intercept it and save it to our database field, &lt;code&gt;full_name&lt;/code&gt;, instead.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;through&lt;/code&gt; model relationships
&lt;/h2&gt;

&lt;p&gt;Django’s &lt;a href="https://docs.djangoproject.com/en/2.2/topics/db/examples/many_to_many/#many-to-many-relationships"&gt;many-to-many relationships&lt;/a&gt; are a great way of handling complex object relationships simply, but they don’t afford us the ability to add custom attributes to the &lt;code&gt;intermediate models&lt;/code&gt; they create. By default, this simply includes an identifier and two foreign key references to join the objects together.&lt;/p&gt;

&lt;p&gt;Using the Django &lt;a href="https://docs.djangoproject.com/en/2.2/topics/db/models/#extra-fields-on-many-to-many-relationships"&gt;&lt;code&gt;ManyToManyField through&lt;/code&gt;&lt;/a&gt; parameter, we can create this intermediate model ourselves and add any additional fields we deem necessary. &lt;/p&gt;

&lt;p&gt;If our application, for example, not only needed users to have memberships within groups, but wanted to track when that membership started, we could use a custom intermediate model to do so.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;accounts&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;uuid&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.contrib.auth.base_user&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AbstractBaseUser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.utils.timezone&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AbstractBaseUser&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UUIDField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="err"&gt;…&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UUIDField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;members&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ManyToManyField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;through&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Membership'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Membership&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UUIDField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;group&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Group&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;joined&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateTimeField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;In the example above, we’re still using a &lt;code&gt;ManyToManyField&lt;/code&gt; to handle the relationship between a user and a group, but by passing the &lt;code&gt;Membership&lt;/code&gt; model using the &lt;code&gt;through&lt;/code&gt; keyword argument, we can now add our &lt;code&gt;joined&lt;/code&gt; custom attribute to the model to track when the group membership was started. This &lt;code&gt;through&lt;/code&gt; model is a standard Django model; it just requires a primary key (we use UUIDs here) and two foreign keys to join the objects together.&lt;/p&gt;
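
&lt;p&gt;To show what the intermediate model stores, here is a rough sketch of the equivalent join table using Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module rather than the Django ORM. The table names and the helper functions &lt;code&gt;build_schema&lt;/code&gt;, &lt;code&gt;add_member&lt;/code&gt;, and &lt;code&gt;members_with_join_dates&lt;/code&gt; are assumptions made for illustration; the schema Django actually generates may differ in its details.&lt;/p&gt;

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def build_schema():
    """Create three tables mirroring User, Group, and Membership."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts_user (id TEXT PRIMARY KEY);
        CREATE TABLE accounts_group (id TEXT PRIMARY KEY);
        CREATE TABLE accounts_membership (
            id TEXT PRIMARY KEY,
            user_id TEXT NOT NULL REFERENCES accounts_user(id),
            group_id TEXT NOT NULL REFERENCES accounts_group(id),
            joined TEXT NOT NULL
        );
        """
    )
    return conn


def add_member(conn, user_id, group_id, joined=None):
    """Insert one membership row; joined defaults to the current time."""
    joined = joined or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO accounts_membership VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), user_id, group_id, joined.isoformat()),
    )


def members_with_join_dates(conn, group_id):
    """Return (user_id, joined) pairs for a group, oldest first."""
    return conn.execute(
        "SELECT user_id, joined FROM accounts_membership "
        "WHERE group_id = ? ORDER BY joined",
        (group_id,),
    ).fetchall()
```

&lt;p&gt;The extra &lt;code&gt;joined&lt;/code&gt; column is the only difference from a bare many-to-many join table, which would hold just the identifier and the two foreign keys.&lt;/p&gt;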

&lt;p&gt;Using the same three-model pattern, we could create a simple subscription database for our site:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;uuid&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.contrib.auth.base_user&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AbstractBaseUser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.utils.timezone&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AbstractBaseUser&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UUIDField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UUIDField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unique&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'free'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;subscribers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ManyToManyField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;through&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Subscription'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;related_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'subscriptions'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;related_query_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'subscriptions'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Subscription&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UUIDField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;primary_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForeignKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;created&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateTimeField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;editable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;updated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateTimeField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;auto_now&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cancelled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DateTimeField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blank&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;null&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Here we’re able to track when a user first subscribed, when they updated their subscription, and, if we add the code paths for it, when they canceled their subscription to our application.&lt;/p&gt;

&lt;p&gt;Using &lt;code&gt;through&lt;/code&gt; models with the &lt;code&gt;ManyToManyField&lt;/code&gt; is a great way to add more data to our intermediate models and provide a more thorough experience for our users without much added work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Proxy models
&lt;/h2&gt;

&lt;p&gt;Normally in Django, when you subclass a model (this doesn’t include &lt;em&gt;abstract models&lt;/em&gt;) into a new class, the framework will create new database tables for that class and link them (via &lt;code&gt;OneToOneField&lt;/code&gt;) to the parent database tables. Django calls this “&lt;a href="https://docs.djangoproject.com/en/2.2/topics/db/models/#multi-table-inheritance"&gt;multi-table inheritance&lt;/a&gt;” and it’s a great way to re-use existing model fields and structures and add your own data to them. “Don’t repeat yourself,” as the Django design philosophies state.&lt;/p&gt;

&lt;p&gt;Multi-table inheritance example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;django.db&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Vehicle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;manufacturer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IntegerField&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Airplane&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Vehicle&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;is_cargo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BooleanField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;is_passenger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BooleanField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
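
&lt;p&gt;For the &lt;code&gt;Vehicle&lt;/code&gt; and &lt;code&gt;Airplane&lt;/code&gt; models above, the two-table layout can be approximated with plain SQL. The sketch below uses Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module rather than Django itself. The table names (&lt;code&gt;demo_vehicle&lt;/code&gt;, &lt;code&gt;demo_airplane&lt;/code&gt;) and the sample row are assumptions for illustration, since the real table names depend on your app label. The &lt;code&gt;vehicle_ptr_id&lt;/code&gt; column mirrors the name Django gives the automatic parent link, which doubles as the child table’s primary key.&lt;/p&gt;

```python
import sqlite3

# In-memory database; table names and sample values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_vehicle (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        model TEXT NOT NULL,
        manufacturer TEXT NOT NULL,
        year INTEGER NOT NULL
    );
    -- The child table stores only its own fields plus a one-to-one link to the parent.
    CREATE TABLE demo_airplane (
        vehicle_ptr_id INTEGER PRIMARY KEY REFERENCES demo_vehicle(id),
        is_cargo INTEGER NOT NULL DEFAULT 0,
        is_passenger INTEGER NOT NULL DEFAULT 1
    );
""")

# Saving one airplane means writing one row to each table.
cur = conn.execute(
    "INSERT INTO demo_vehicle (model, manufacturer, year) VALUES (?, ?, ?)",
    ("Model X", "Acme", 2019),
)
conn.execute("INSERT INTO demo_airplane (vehicle_ptr_id) VALUES (?)", (cur.lastrowid,))

# Reading the airplane back requires joining child and parent.
airplane = conn.execute(
    """
    SELECT v.model, v.manufacturer, v.year, a.is_cargo, a.is_passenger
    FROM demo_airplane AS a
    JOIN demo_vehicle AS v ON v.id = a.vehicle_ptr_id
    """
).fetchone()
```

&lt;p&gt;The join on every read is the price of multi-table inheritance, and it is worth keeping in mind when deciding whether a subclass really needs columns of its own.&lt;/p&gt;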



&lt;p&gt;This example would create ...&lt;/p&gt;

&lt;p&gt;...&lt;a href="https://kite.com/blog/python/advanced-django-models-python-overview/"&gt;continue with Proxy Models&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Zac Clancy is Vice President at Global DIRT (Disaster Immediate Response Team)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>django</category>
      <category>python</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Handling Imbalanced Datasets with SMOTE in Python</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Tue, 01 Oct 2019 19:04:28 +0000</pubDate>
      <link>https://dev.to/kite/handling-imbalanced-datasets-with-smote-in-python-3h6h</link>
      <guid>https://dev.to/kite/handling-imbalanced-datasets-with-smote-in-python-3h6h</guid>
      <description>&lt;p&gt;Close your eyes. &lt;/p&gt;

&lt;p&gt;Now imagine a perfect data world. What do you see? What do you wish to see? Exactly, me too. A flawlessly balanced dataset. A collection of data whose labels form a magnificent 1:1 ratio: 50% of this, 50% of that; not a bit to the left, nor a bit to the right. Just perfectly balanced, as all things should be. Now open your eyes, and come back to the real world.&lt;/p&gt;

&lt;p&gt;The opposite of a purely balanced dataset is a highly imbalanced dataset, and unfortunately for us, these are quite common. An imbalanced dataset is a dataset where the number of data points per class differs drastically, which can result in a heavily biased machine learning model that struggles to learn the minority class. When the imbalance is not heavily skewed toward one class, such a dataset is not &lt;em&gt;that&lt;/em&gt; horrible, since many machine learning models can handle it.&lt;/p&gt;

&lt;p&gt;Nevertheless, there are some extreme cases in which the class ratio is far more lopsided: for example, a dataset where 95% of the labels belong to class A, while the remaining 5% fall under class B, a ratio that is not so rare in use cases such as fraud detection. In these extreme cases, the ideal course of action would be to collect more data.&lt;/p&gt;
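
&lt;p&gt;To see why a 95/5 split is troublesome, consider a “model” that ignores its inputs and always predicts the majority class. The sketch below uses made-up labels (95 of class &lt;code&gt;A&lt;/code&gt;, 5 of class &lt;code&gt;B&lt;/code&gt;) and plain Python; the variable names are illustrative only.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical labels with the 95% / 5% split described above.
labels = ["A"] * 95 + ["B"] * 5

# A trivial baseline: always predict the most common class.
counts = Counter(labels)
majority_label, majority_count = counts.most_common(1)[0]
predictions = [majority_label] * len(labels)

# Overall accuracy looks excellent...
accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)

# ...but recall on the minority class is zero: not one "B" is ever found.
minority_total = len(labels) - majority_count
minority_hits = sum(
    p == t for p, t in zip(predictions, labels) if t != majority_label
)
minority_recall = minority_hits / minority_total

print(f"accuracy={accuracy:.2f}, minority recall={minority_recall:.2f}")
```

&lt;p&gt;A 95% accuracy score here says nothing about class B, which is why accuracy alone can be misleading on imbalanced data.&lt;/p&gt;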

&lt;p&gt;However, this is typically not feasible; in fact, it’s costly, time-consuming and in most cases, impossible. Luckily for us, there’s an alternative known as oversampling. Oversampling involves using the data we currently have to create more of it. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is data oversampling?
&lt;/h2&gt;

&lt;p&gt;Data oversampling is a technique applied to generate data in such a way that it resembles the underlying distribution of the real data. In this article, I explain how we can use an oversampling technique called &lt;strong&gt;Synthetic Minority Over-Sampling Technique&lt;/strong&gt; or &lt;strong&gt;SMOTE&lt;/strong&gt; to balance out our dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is SMOTE?
&lt;/h2&gt;

&lt;p&gt;SMOTE is an oversampling algorithm that relies on the concept of nearest neighbors to create its synthetic data. Proposed back in &lt;a href="https://arxiv.org/abs/1106.1813"&gt;2002 by Chawla et al.&lt;/a&gt;, SMOTE has become one of the most popular algorithms for oversampling.&lt;/p&gt;

&lt;p&gt;The simplest form of oversampling, known as random oversampling or upsampling, duplicates randomly selected observations from the outnumbered class.&lt;/p&gt;
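
&lt;p&gt;As a rough illustration of this duplication step, the sketch below balances a tiny made-up dataset using only the standard library. The function name &lt;code&gt;random_oversample&lt;/code&gt; and the sample points are hypothetical; this is not the imbalanced-learn implementation.&lt;/p&gt;

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        # All existing observations of this class.
        pool = [s for s, lab in zip(samples, labels) if lab == label]
        # Draw (with replacement) as many copies as the class is missing.
        extra = rng.choices(pool, k=target - count)
        out_samples.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_samples, out_labels


# Hypothetical data: four points of class 0, one point of class 1.
samples = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.5, 0.5), (2.0, 2.1)]
labels = [0, 0, 0, 0, 1]

new_samples, new_labels = random_oversample(samples, labels)
print(Counter(new_labels))
```

&lt;p&gt;Note that every added row is an exact copy of an existing one, so the dataset becomes balanced without gaining any new information.&lt;/p&gt;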

&lt;p&gt;The appeal of this approach is that every observation we add is a real example of already existing data. The drawback is that we only create more of the same data we currently have, without adding any diversity to our dataset, which can produce effects such as overfitting.&lt;/p&gt;

&lt;p&gt;Hence, if overfitting affects our training due to randomly duplicated, upsampled data, or if plain oversampling is not suitable for the task at hand, we could resort to another, smarter oversampling technique known as synthetic data generation.&lt;/p&gt;

&lt;p&gt;Synthetic data is intelligently generated artificial data that resembles the shape or values of the data it is intended to enhance. Instead of merely making new examples by &lt;em&gt;copying&lt;/em&gt; the data we already have (as explained in the last paragraph), a synthetic data generator &lt;em&gt;creates&lt;/em&gt; data that is similar to the existing data. Creating synthetic data is where SMOTE shines.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does SMOTE work?
&lt;/h2&gt;

&lt;p&gt;To show how SMOTE works, suppose we have an imbalanced two-dimensional dataset, such as the one in the next image, and we want to use SMOTE to create new data points.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--feXVf4V---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/imbalance-dataset.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--feXVf4V---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/imbalance-dataset.jpg" alt="example of an imbalanced dataset"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example of an imbalanced dataset&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For each observation that belongs to the under-represented class, the algorithm finds its k nearest neighbors within that class, picks one of them at random, and synthesizes a new instance of the minority label at a random location on the line between the current observation and that neighbor.&lt;/p&gt;
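
&lt;p&gt;The interpolation step can be sketched from scratch in a few lines of plain Python. This is a simplified illustration, not the imbalanced-learn implementation used later in this tutorial: the function name &lt;code&gt;smote_sketch&lt;/code&gt;, its parameters, and the sample points are assumptions, and it uses a brute-force neighbor search that only suits tiny datasets.&lt;/p&gt;

```python
import math
import random


def smote_sketch(minority, k=2, n_per_point=1, seed=0):
    """Create synthetic points on the segments between minority points and their neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for i, point in enumerate(minority):
        # Brute-force search for the k nearest minority neighbors of this point.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(point, p))[:k]
        for _ in range(n_per_point):
            neighbor = rng.choice(neighbors)
            gap = rng.random()  # random position along the segment, in [0, 1)
            # point + gap * (neighbor - point), computed per coordinate
            synthetic.append(
                tuple(a + gap * (b - a) for a, b in zip(point, neighbor))
            )
    return synthetic


# Hypothetical two-dimensional minority-class observations.
minority = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (2.0, 2.0)]
new_points = smote_sketch(minority)
print(new_points)
```

&lt;p&gt;Unlike plain duplication, each synthetic point is new, yet it always lies between two real minority observations, so it stays inside the region the minority class already occupies.&lt;/p&gt;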

&lt;p&gt;In our example (shown in the next image), the blue encircled dot is the current observation, the blue non-encircled dot is its nearest neighbor, and the green dot is the synthetic one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--dpQ79qWU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/SMOTEs-new-synthetic-data-point.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--dpQ79qWU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/SMOTEs-new-synthetic-data-point.jpg" alt="new synthetic data point generated by SMOTE"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;SMOTE’s new synthetic data point&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Now let’s do it in Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  SMOTE tutorial using imbalanced-learn
&lt;/h2&gt;

&lt;p&gt;In this tutorial, I explain how to balance an imbalanced dataset using the package &lt;a href="https://imbalanced-learn.readthedocs.io/en/stable/"&gt;&lt;strong&gt;imbalanced-learn&lt;/strong&gt;&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;First, I create a perfectly balanced dataset and train a machine learning model with it, which I’ll call our “&lt;strong&gt;base model&lt;/strong&gt;.” Then, I’ll unbalance the dataset and train a second model, which I’ll call the “&lt;strong&gt;imbalanced model&lt;/strong&gt;.”&lt;/p&gt;

&lt;p&gt;Finally, I’ll use SMOTE to balance out the dataset and fit a third model with it, which I’ll name the “&lt;strong&gt;SMOTE’d&lt;/strong&gt;” model. By training a new model at each step, we’ll be able to better understand how an imbalanced dataset can affect a machine learning system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Base model
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Example code for this article may be found at the&lt;/em&gt; &lt;a href="https://github.com/kiteco/kite-python-blog-post-code/tree/master/smote"&gt;&lt;em&gt;Kite Blog repository&lt;/em&gt;&lt;/a&gt;&lt;em&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For the initial task, I’ll fit a &lt;a href="https://en.wikipedia.org/wiki/Support-vector_machine"&gt;&lt;strong&gt;support-vector machine&lt;/strong&gt;&lt;/a&gt; (SVM) model using a synthetic, perfectly balanced dataset. I chose this kind of model because of how easy it is to visualize and understand its decision boundary, namely, the hyperplane that separates one class from the other.&lt;/p&gt;

&lt;p&gt;To generate a balanced dataset, I’ll use scikit-learn’s &lt;a href="https://kite.com/python/docs/sklearn.datasets.make_classification"&gt;make_classification&lt;/a&gt; function, which creates n clusters of normally distributed points suitable for a classification problem.&lt;/p&gt;

&lt;p&gt;My fake dataset consists of 700 sample points, two features, and two classes. To make sure each class is one blob of data, I’ll set the parameter &lt;code&gt;n_clusters_per_class&lt;/code&gt; to 1. &lt;/p&gt;

&lt;p&gt;To simplify it, I’ll remove the redundant features and set the number of informative features to 2. Lastly, I’ll use &lt;code&gt;flip_y=0.06&lt;/code&gt; to set the amount of label noise (the fraction of samples whose class is assigned at random).&lt;/p&gt;

&lt;p&gt;The following piece of code shows how we can create our fake dataset and plot it using Python’s Matplotlib.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;make_classification&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;imblearn.datasets&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;make_imbalance&lt;/span&gt;

&lt;span class="c1"&gt;# for reproducibility purposes
&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="c1"&gt;# create balanced dataset
&lt;/span&gt;&lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Y1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_classification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;700&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_redundant&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="n"&gt;n_informative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_clusters_per_class&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                            &lt;span class="n"&gt;class_sep&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flip_y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.06&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Balanced dataset'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'x'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'y'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;marker&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edgecolor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'k'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coolwarm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# concatenate the features and labels into one dataframe
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Y1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; &lt;span class="n"&gt;axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'feature_1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'feature_2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'label'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# save the dataset because we'll use it later
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'df_base.csv'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'utf-8'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iDLtYvW7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/balanced-dataset.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iDLtYvW7--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/balanced-dataset.jpg" alt="a perfectly balanced dataset with 1:1 ratio"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A balanced dataset&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As you can see in the previous image, our balanced dataset looks tidy and well defined. So, if we fit an SVM model with this data (code below), how will the decision boundary look?&lt;/p&gt;

&lt;p&gt;Since we’ll be training several models and visualizing their hyperplanes, I wrote two functions that will be reused several times throughout the tutorial. The first one, &lt;code&gt;train_SVM&lt;/code&gt;, is for fitting the SVM model, and it takes the dataset as a parameter. &lt;/p&gt;

&lt;p&gt;The second function, &lt;code&gt;plot_svm_boundary&lt;/code&gt;, plots the decision boundary of the SVM model. It takes the fitted model, the dataset, and the title of the plot as parameters.&lt;/p&gt;

&lt;p&gt;These are the functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sklearn.svm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SVC&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_SVM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="c1"&gt;# select the feature columns
&lt;/span&gt;   &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;'label'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
   &lt;span class="c1"&gt;# select the label column
&lt;/span&gt;   &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;

   &lt;span class="c1"&gt;# train an SVM with linear kernel
&lt;/span&gt;   &lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SVC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kernel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'linear'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;clf&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plot_svm_boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
   &lt;span class="n"&gt;X0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iloc&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

   &lt;span class="n"&gt;x_min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
   &lt;span class="n"&gt;y_min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_max&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
   &lt;span class="n"&gt;xx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;meshgrid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_min&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_max&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

   &lt;span class="n"&gt;Z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;c_&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;xx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ravel&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;yy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ravel&lt;/span&gt;&lt;span class="p"&gt;()])&lt;/span&gt;
   &lt;span class="n"&gt;Z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contourf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;xx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;yy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coolwarm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

   &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cmap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coolwarm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edgecolors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'k'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'y'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'x'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
   &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
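
&lt;p&gt;The grid construction inside &lt;code&gt;plot_svm_boundary&lt;/code&gt; can be easier to follow on a tiny example. The sketch below uses a coarse 3 x 2 grid and a made-up parity rule in place of &lt;code&gt;clf.predict&lt;/code&gt;; the names &lt;code&gt;grid_points&lt;/code&gt; and &lt;code&gt;labels&lt;/code&gt; are illustrative and not part of the original code.&lt;/p&gt;

```python
import numpy as np

# Build a coarse 3 x 2 grid instead of the fine 0.02-step grid used for plotting
xx, yy = np.meshgrid(np.arange(0, 3), np.arange(0, 2))

# Flatten both coordinate arrays and pair them column-wise:
# each row of grid_points is one (x, y) location to classify
grid_points = np.c_[xx.ravel(), yy.ravel()]

# Hypothetical stand-in for clf.predict: label each point by coordinate parity
labels = (grid_points[:, 0] + grid_points[:, 1]) % 2

# Reshape the flat predictions back to the grid layout expected by contourf
Z = labels.reshape(xx.shape)

print(grid_points.shape)  # (6, 2)
print(Z.shape)            # (2, 3)
```

&lt;p&gt;In the real function the classifier fills &lt;code&gt;Z&lt;/code&gt; with predicted classes, and &lt;code&gt;contourf&lt;/code&gt; colours each region of the grid accordingly.&lt;/p&gt;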



&lt;p&gt;To fit and plot the model, do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'df_base.csv'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'utf-8'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'python'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;clf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;train_SVM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plot_svm_boundary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Decision Boundary of SVM trained with a balanced dataset'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Y_vEks3M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/SVM-Trained-balanced-dataset.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Y_vEks3M--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/SVM-Trained-balanced-dataset.jpg" alt="decision boundary of an SVM model trained with a balanced dataset"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blue dots on the blue side and red dots on the red side mean that the model was able to find a function that separates the classes.&lt;/p&gt;

&lt;p&gt;The image above presents the hyperplane of the base model. On it, we can observe how clear the separation between our classes is. However, what would happen if we imbalanced our dataset? How would the decision boundary look? To find out, let’s first imbalance the dataset by calling the function &lt;a href="https://imbalanced-learn.readthedocs.io/en/stable/generated/imblearn.datasets.make_imbalance.html"&gt;make_imbalance&lt;/a&gt; from the &lt;strong&gt;imbalanced-learn&lt;/strong&gt; package.&lt;/p&gt;

&lt;p&gt;... &lt;a href="https://kite.com/blog/python/smote-python-imbalanced-learn-for-oversampling/"&gt;continue with imbalanced-learn&lt;/a&gt; on the Kite blog!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://juandes.com/about-me/"&gt;Juan De Dios Santos&lt;/a&gt; is a travelling &lt;em&gt;Data Storyteller&lt;/em&gt; and Machine Learning professional working on the &lt;em&gt;Wander Data&lt;/em&gt; project.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>tutorial</category>
      <category>python</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Time Series Analysis with Pandas</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Thu, 26 Sep 2019 18:16:58 +0000</pubDate>
      <link>https://dev.to/kite/time-series-analysis-with-pandas-3472</link>
      <guid>https://dev.to/kite/time-series-analysis-with-pandas-3472</guid>
      <description>&lt;p&gt;We’ll be analyzing stock data with Python 3, pandas and Matplotlib. To fully benefit from this article, you should be familiar with the basics of pandas as well as the plotting library called Matplotlib.&lt;/p&gt;

&lt;h2&gt;
  
  
  Time series data
&lt;/h2&gt;

&lt;p&gt;Time series data is a sequence of data points in chronological order, which businesses use to analyze past data and make future predictions. These data points are a set of observations taken at specified times, usually at equal intervals, and typically consist of a datetime index and a corresponding value. Common examples of time series data in our day-to-day lives include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Weather temperature measurements&lt;/li&gt;
&lt;li&gt;The number of taxi rides per month&lt;/li&gt;
&lt;li&gt;A company’s daily stock prices, which can be used to predict the next day’s price&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Variations of time series data
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trend Variation:&lt;/strong&gt; moves up or down in a reasonably predictable pattern over a long period of time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seasonality Variation:&lt;/strong&gt; regular and periodic; repeats itself over a specific period, such as a day, week, month, season, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cyclical Variation:&lt;/strong&gt; corresponds with business or economic ‘boom-bust’ cycles, or is cyclical in some other form.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Random Variation:&lt;/strong&gt; erratic or residual; doesn’t fall under any of the above three classifications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here are the four variations of time series data visualized:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--R6yzN7XX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/variations-of-time-series.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--R6yzN7XX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/variations-of-time-series.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;
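
&lt;p&gt;To get a feel for how these variations combine, the following sketch builds a synthetic daily series from a trend, a weekly seasonal pattern and random noise. All names and numbers here are illustrative assumptions rather than part of the stock dataset, and the cyclical component is left out for brevity.&lt;/p&gt;

```python
import numpy as np
import pandas as pd

# Hypothetical daily index covering 52 full weeks
index = pd.date_range('2018-01-01', periods=364, freq='D')
steps = np.arange(len(index))

# Trend: a slow, steady rise over the whole period
trend = 0.1 * steps

# Seasonality: a pattern that repeats every 7 days
seasonal = 5 * np.sin(2 * np.pi * steps / 7)

# Random variation: noise with no structure (seeded for reproducibility)
rng = np.random.default_rng(0)
noise = rng.normal(0, 1, len(index))

# The observed series is the sum of the components
series = pd.Series(trend + seasonal + noise, index=index)
print(series.head())
```

&lt;p&gt;Plotting &lt;code&gt;series&lt;/code&gt; with &lt;code&gt;series.plot()&lt;/code&gt; should show the weekly wiggle riding on top of the upward trend.&lt;/p&gt;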

&lt;h2&gt;
  
  
  Importing stock data and necessary Python libraries
&lt;/h2&gt;

&lt;p&gt;To demonstrate the use of pandas for stock analysis, we will be using Amazon stock prices from 2013 to 2018. We’re pulling the data from Quandl, a company offering a Python API for sourcing a la carte market data. A CSV file of the data in this article can be downloaded from the article’s repository.      &lt;/p&gt;

&lt;p&gt;Fire up the editor of your choice and type in the following code to import the libraries and data that correspond to this article. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example code for this article may be found at the&lt;/em&gt; &lt;a href="https://github.com/kiteco/kite-python-blog-post-code/tree/master/pandas-time-series-analysis"&gt;&lt;em&gt;Kite Blog repository&lt;/em&gt;&lt;/a&gt; &lt;em&gt;on GitHub.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Importing required modules
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="c1"&gt;# Settings for pretty nice plots
&lt;/span&gt;&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;style&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'fivethirtyeight'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Reading in the data
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'amazon_stock.csv'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  A first look at Amazon’s stock prices
&lt;/h3&gt;

&lt;p&gt;Let’s look at the first few rows of the dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Inspecting the data
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BTWDcR0c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/amazo-stock-price.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BTWDcR0c--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/amazo-stock-price.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s get rid of the first two columns as they don’t add any value to the dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'None'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ticker'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--P1s5-Qlz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/datatypes.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--P1s5-Qlz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/datatypes.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;
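
&lt;p&gt;As an alternative to &lt;code&gt;inplace=True&lt;/code&gt;, &lt;code&gt;drop()&lt;/code&gt; can return a new frame that is assigned to a variable. The sketch below illustrates this on a tiny frame whose values are made up; only the column names follow the article, and &lt;code&gt;frame&lt;/code&gt; and &lt;code&gt;trimmed&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical two-row frame mimicking the layout described in the article
frame = pd.DataFrame({
    'None': [0, 1],
    'ticker': ['AMZN', 'AMZN'],
    'Date': ['2018-03-27', '2018-03-26'],
    'Adj_Close': [1500.0, 1550.0],  # made-up prices
})

# drop() returns a new frame by default; the original is left untouched
trimmed = frame.drop(columns=['None', 'ticker'])

print(list(trimmed.columns))  # ['Date', 'Adj_Close']
print(list(frame.columns))    # ['None', 'ticker', 'Date', 'Adj_Close']
```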

&lt;p&gt;Let us now look at the data types of the columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JWc8JRpW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/datatypes2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JWc8JRpW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/datatypes2.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It appears that the Date column is being treated as a string rather than as dates. To fix this, we’ll use the pandas &lt;code&gt;to_datetime()&lt;/code&gt; function, which converts its argument to datetime values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Convert string to datetime64
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Date'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Date'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
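
&lt;p&gt;Real-world date columns sometimes contain entries that cannot be parsed. As a minimal sketch, assuming a column of hypothetical date strings, &lt;code&gt;to_datetime()&lt;/code&gt; can be applied to the whole column at once, with &lt;code&gt;errors='coerce'&lt;/code&gt; turning bad entries into &lt;code&gt;NaT&lt;/code&gt; instead of raising an error:&lt;/p&gt;

```python
import pandas as pd

# Hypothetical date strings standing in for the Date column
dates = pd.Series(['2018-03-27', '2018-03-26', 'not a date'])

# Convert the whole column at once; errors='coerce' turns unparseable
# entries into NaT instead of raising an exception
converted = pd.to_datetime(dates, errors='coerce')

print(pd.api.types.is_datetime64_any_dtype(converted))  # True
print(converted.isna().sum())                           # 1
```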



&lt;p&gt;Lastly, we want to make sure that the Date column is the index column.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Date'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DG5Kk9Ui--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/converted-data.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DG5Kk9Ui--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/converted-data.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;
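
&lt;p&gt;The conversion and indexing steps can also be folded into the call that reads the file. The sketch below assumes a small in-memory CSV with made-up values in place of &lt;code&gt;amazon_stock.csv&lt;/code&gt;; the &lt;code&gt;parse_dates&lt;/code&gt; and &lt;code&gt;index_col&lt;/code&gt; parameters of &lt;code&gt;read_csv()&lt;/code&gt; do the work.&lt;/p&gt;

```python
import io
import pandas as pd

# Hypothetical CSV content; the real file has more columns and rows
csv_text = """Date,Adj_Close
2018-03-27,1500.0
2018-03-26,1550.0
"""

# parse_dates converts the Date column while reading,
# and index_col makes it the index in the same step
frame = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], index_col='Date')

print(type(frame.index).__name__)  # DatetimeIndex
```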

&lt;p&gt;Now that our data has been converted into the desired format, let’s take a look at its columns for further analysis.      &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Open&lt;/strong&gt; and &lt;strong&gt;Close&lt;/strong&gt; columns indicate the opening and closing price of the stocks on a particular day.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;High&lt;/strong&gt; and &lt;strong&gt;Low&lt;/strong&gt; columns provide the highest and the lowest price for the stock on a particular day, respectively.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Volume&lt;/strong&gt; column tells us the total volume of stocks traded on a particular day.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;Adj_Close&lt;/code&gt; column represents the adjusted closing price, or the stock’s closing price on any given day of trading, amended to include any distributions and/or corporate actions occurring any time before the next day’s open. The adjusted closing price is often used when examining or performing a detailed analysis of historical returns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Adj_Close'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Adjusted Closing Price'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4hW6uyBO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/price-chart.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4hW6uyBO--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/price-chart.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Interestingly, it appears that Amazon had a more or less steady increase in its stock price over the 2013-2018 window. We’ll now use pandas to analyze and manipulate this data to gain insights.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pandas for time series analysis
&lt;/h2&gt;

&lt;p&gt;As pandas was developed in the context of financial modeling, it contains a comprehensive set of tools for working with dates, times, and time-indexed data. Let’s look at the main pandas data structures for working with time series data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Manipulating &lt;code&gt;datetime&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Python’s basic tools for working with dates and times reside in the built-in &lt;code&gt;datetime&lt;/code&gt; module. In pandas, a single point in time is represented as a &lt;code&gt;pandas.Timestamp&lt;/code&gt;. We can use the &lt;code&gt;datetime()&lt;/code&gt; constructor to create &lt;code&gt;datetime&lt;/code&gt; objects from individual components such as year, month and day; to parse strings in a wide variety of date/time formats, pandas offers &lt;code&gt;to_datetime()&lt;/code&gt;. In most pandas operations, &lt;code&gt;datetime&lt;/code&gt; objects can be used interchangeably with &lt;code&gt;pandas.Timestamp&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="n"&gt;my_year&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2019&lt;/span&gt;
&lt;span class="n"&gt;my_month&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;my_day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;21&lt;/span&gt;
&lt;span class="n"&gt;my_hour&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;my_minute&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;my_second&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Given the above values, we can now create a &lt;code&gt;datetime&lt;/code&gt; object and use it freely with pandas.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;test_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_day&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;test_date&lt;/span&gt;


&lt;span class="c1"&gt;# datetime.datetime(2019, 4, 21, 0, 0)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;For the purposes of analyzing our particular data, we have selected only the day, month and year, but we could also include more details like hour, minute and second if necessary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;test_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;my_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_hour&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_minute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;my_second&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'The day is : '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'The hour is : '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'The month is : '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Output:
&lt;/span&gt;&lt;span class="c1"&gt;# The day is :  21
&lt;/span&gt;&lt;span class="c1"&gt;# The hour is :  10
&lt;/span&gt;&lt;span class="c1"&gt;# The month is :  4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
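
&lt;p&gt;To illustrate how closely &lt;code&gt;datetime&lt;/code&gt; and &lt;code&gt;pandas.Timestamp&lt;/code&gt; are related, here is a small sketch that parses a string with the standard library and wraps the result in a &lt;code&gt;Timestamp&lt;/code&gt;. The variable names &lt;code&gt;parsed&lt;/code&gt; and &lt;code&gt;stamp&lt;/code&gt; are illustrative.&lt;/p&gt;

```python
from datetime import datetime
import pandas as pd

# Parse a string with the standard library (the format must be given explicitly)
parsed = datetime.strptime('2019-04-21 10:05:30', '%Y-%m-%d %H:%M:%S')

# Wrap the datetime in a pandas Timestamp; the components are preserved
stamp = pd.Timestamp(parsed)

print(isinstance(stamp, datetime))         # True, Timestamp subclasses datetime
print(stamp == parsed)                     # True
print(stamp.day, stamp.hour, stamp.month)  # 21 10 4
```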



&lt;p&gt;For our stock price dataset, the type of the index column is &lt;code&gt;DatetimeIndex&lt;/code&gt;. We can use pandas to obtain the minimum and maximum dates in the data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c1"&gt;# Output:
&lt;/span&gt;&lt;span class="c1"&gt;# 2018-03-27 00:00:00
&lt;/span&gt;&lt;span class="c1"&gt;# 2013-01-02 00:00:00&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can also find the index locations (integer positions) of the earliest and latest dates as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Earliest date index location
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;#Output
&lt;/span&gt;&lt;span class="mi"&gt;1315&lt;/span&gt;
&lt;span class="c1"&gt;# Latest date location
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;#Output
&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
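
&lt;p&gt;A latest-date position of 0 suggests that the rows are stored newest-first. The following sketch reproduces that situation on a hypothetical three-entry index and shows that &lt;code&gt;argmin()&lt;/code&gt; and &lt;code&gt;argmax()&lt;/code&gt; return integer positions rather than dates:&lt;/p&gt;

```python
import pandas as pd

# Hypothetical index sorted newest-first, like the stock data appears to be
index = pd.DatetimeIndex(['2018-03-27', '2018-03-26', '2018-03-23'])

# argmin/argmax return integer positions, not dates
earliest_pos = index.argmin()  # position 2 here
latest_pos = index.argmax()    # position 0 here

# Index with the position to recover the actual date
print(earliest_pos, index[earliest_pos])
print(latest_pos, index[latest_pos])

# After sorting in ascending order, the earliest date moves to position 0
ordered = index.sort_values()
print(ordered.argmin())  # 0
```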



&lt;h3&gt;
  
  
  Time resampling
&lt;/h3&gt;

&lt;p&gt;Examining stock price data for every single day isn’t of much use to financial institutions, who are more interested in spotting market trends. To make it easier, we use a process called time resampling to aggregate data into a defined time period, such as by month or by quarter. Institutions can then see an overview of stock prices and make decisions according to these trends.           &lt;/p&gt;

&lt;p&gt;The pandas library has a &lt;code&gt;resample()&lt;/code&gt; method for resampling such time series data. It is similar to the &lt;code&gt;groupby&lt;/code&gt; method, as it essentially groups rows according to a certain time span. A call to &lt;code&gt;resample()&lt;/code&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rule&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To summarize:     &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;data.resample()&lt;/code&gt; is used to resample the stock data.&lt;/li&gt;
&lt;li&gt;The ‘A’ stands for year-end frequency, and denotes the offset values by which we want to resample the data.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mean()&lt;/code&gt; indicates that we want the average stock price during this period.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output looks like this, with each year’s average labeled with December 31st of that year:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2HCDFZD5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/stock-data.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2HCDFZD5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/stock-data.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below is a complete list of the offset values. The list can also be found in the pandas documentation.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tqnQka1o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/alias.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Offset aliases for time resampling&lt;/p&gt;
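
&lt;p&gt;Since resampling is essentially a time-based &lt;code&gt;groupby&lt;/code&gt;, the yearly mean can be reproduced by grouping on the year of the index. The sketch below uses a small made-up price series (the values and names are assumptions for illustration). It deliberately avoids offset aliases, because newer pandas releases have renamed some of them, so check the alias table for your version.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical toy data: six daily prices spanning two calendar years.
dates = pd.to_datetime([
    "2018-12-27", "2018-12-28", "2018-12-31",
    "2019-01-02", "2019-01-03", "2019-01-04",
])
prices = pd.Series([10.0, 12.0, 14.0, 20.0, 22.0, 27.0], index=dates)

# Group the rows by the calendar year of their timestamp and average each group.
yearly_mean = prices.groupby(prices.index.year).mean()
print(yearly_mean)
```

&lt;p&gt;The result is indexed by the year itself, whereas a year-end resample labels each group with a date.&lt;/p&gt;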

&lt;p&gt;We can also use time resampling to plot charts for specific columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Adj_Close'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;resample&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'A'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'bar'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;figsize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Yearly Mean Adj Close Price for Amazon'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--lVlOo0sh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/amazon-price.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--lVlOo0sh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/amazon-price.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above bar plot shows Amazon’s mean adjusted closing price for each year in our data set, with each bar labeled by its year-end date. &lt;/p&gt;

&lt;p&gt;Similarly, monthly maximum opening price for each year can be found below. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--w-at3jmH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/yearly-price-for-amazon.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--w-at3jmH--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://kite.com/wp-content/uploads/2019/08/yearly-price-for-amazon.jpg" alt="img"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Monthly maximum opening price for Amazon&lt;/p&gt;

&lt;h2&gt;
  
  
  Time shifting
&lt;/h2&gt;

&lt;p&gt;Sometimes, we may need to shift or move the data forward or backwards in time. This shifting is done along a time index by the desired number of time-frequency increments.&lt;/p&gt;

&lt;p&gt;...&lt;a href="https://kite.com/blog/python/pandas-time-series-analysis/"&gt;continue with Time Shifting&lt;/a&gt; and see the code in the Kite Github repo.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://towardsdatascience.com/@parulnith"&gt;Parul Pandey&lt;/a&gt; is a Data Science Evangelist at H2O.ai&lt;/em&gt; and author for the Kite Blog.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>python</category>
      <category>pandas</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Guide: Type Hinting in Python</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Mon, 23 Sep 2019 18:08:23 +0000</pubDate>
      <link>https://dev.to/kite/guide-type-hinting-in-python-1p1l</link>
      <guid>https://dev.to/kite/guide-type-hinting-in-python-1p1l</guid>
      <description>&lt;h1&gt;
  
  
  Guide: Type Hinting in Python
&lt;/h1&gt;

&lt;p&gt;Since version 3.5, Python supports type hints: code annotations that, through additional tooling, can check if you’re using your code correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Type hints arrived in Python 3.5 through PEP 484 and the &lt;code&gt;typing&lt;/code&gt; module. The annotations do nothing by themselves; they only pay off once a type checker or an IDE reads them.&lt;/p&gt;

&lt;p&gt;Long-time Python users might cringe at the thought of new code needing type hinting to work properly, but we need not worry: Guido himself wrote in PEP 484, “no type checking happens at runtime.”&lt;/p&gt;

&lt;p&gt;The feature has been proposed mainly to open up Python code for easier static analysis and refactoring.&lt;/p&gt;

&lt;p&gt;For data science, and for the data scientist, type hinting is invaluable for a couple of reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It makes it much easier to understand the code, just by looking at the signature, i.e. the first line(s) of the function definition;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It creates a documentation layer that can be checked with a type checker, i.e. if you change the implementation, but forget to change the types, the type checker will (hopefully) yell at you.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Of course, as is always the case with documentation and testing, it’s an investment: it costs you more time at the beginning, but saves you (and your co-worker) a lot in the long run.&lt;/p&gt;

&lt;p&gt;Note: Type hinting has also been ported to Python 2.7 (a.k.a Legacy Python). The functionality, however, requires comments to work. Furthermore, no one should be using Legacy Python in 2019: it’s less beautiful and only has a couple more months of updates before it stops receiving support of any kind.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started with types
&lt;/h2&gt;

&lt;p&gt;The code for this article may be found at &lt;a href="https://github.com/kiteco/kite-python-blog-post-code/tree/master/python-typing"&gt;Kite’s Github repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The hello world of type hinting is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# hello_world.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hello_world&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'Joe'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;f'Hello &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We have added two type hint elements here. The first one is &lt;code&gt;: str&lt;/code&gt; after name and the second one is &lt;code&gt;-&amp;gt; str&lt;/code&gt; towards the end of the signature.&lt;/p&gt;

&lt;p&gt;The syntax works as you would expect: we’re marking &lt;code&gt;name&lt;/code&gt; to be of type &lt;code&gt;str&lt;/code&gt; and we’re specifying that the &lt;code&gt;hello_world&lt;/code&gt; function should return a &lt;code&gt;str&lt;/code&gt;. If we use our function, it does what it says:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;hello_world&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Mark'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;'Hello Mark'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Since Python remains a dynamically typed language, we can still shoot ourselves in the foot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;hello_world&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="s"&gt;'Hello 2'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;What’s happening? Well, as I wrote in the introduction, no type checking happens at runtime. &lt;/p&gt;

&lt;p&gt;So as long as the code doesn’t raise an exception, things will continue to work fine.&lt;/p&gt;

&lt;p&gt;What should you do with these type definitions then? Well, you need a type checker, or an IDE that reads and checks the types in your code (PyCharm, for example).&lt;/p&gt;
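
&lt;p&gt;To see why nothing complains at runtime, it helps to look at where the hints end up. The sketch below assumes a version of &lt;code&gt;hello_world&lt;/code&gt; that returns its greeting: the interpreter stores the hints as plain metadata on the function object and never consults them when the function is called.&lt;/p&gt;

```python
def hello_world(name: str = 'Joe') -> str:
    return f'Hello {name}'

# The hints are stored as a plain dictionary on the function object...
print(hello_world.__annotations__)

# ...but the interpreter does not enforce them when the function is called.
result = hello_world(name=2)
print(result)
```

&lt;p&gt;That stored metadata is what type checkers and IDEs work from.&lt;/p&gt;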

&lt;h2&gt;
  
  
  Type checking your program
&lt;/h2&gt;

&lt;p&gt;There are at least four major type checker implementations: &lt;a href="http://mypy-lang.org/"&gt;Mypy&lt;/a&gt;, &lt;a href="https://github.com/Microsoft/pyright"&gt;Pyright&lt;/a&gt;, &lt;a href="https://pyre-check.org/"&gt;Pyre&lt;/a&gt;, and &lt;a href="https://github.com/google/pytype"&gt;Pytype&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Mypy&lt;/em&gt; is actively developed by, among others, Guido van Rossum, Python’s creator;&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Pyright&lt;/em&gt; has been developed by Microsoft and integrates very well with their excellent Visual Studio Code;&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Pyre&lt;/em&gt; has been developed by Facebook with the goal to be fast (even though Mypy recently got much faster);&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Pytype&lt;/em&gt; has been developed by Google and, besides checking the types as the others do, it can run type checks (and add annotations) on unannotated code.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since we want to focus on how to use typing from a Python perspective, we’ll use Mypy in this tutorial. We can install it using &lt;code&gt;pip&lt;/code&gt; (or your package manager of choice) and then run it on our file:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ pip install mypy
$ mypy hello_world.py 

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;h2&gt;
  
  
  More advanced types
&lt;/h2&gt;

&lt;p&gt;In principle, all Python classes are valid types, meaning you can use &lt;code&gt;str&lt;/code&gt;, &lt;code&gt;int&lt;/code&gt;, &lt;code&gt;float&lt;/code&gt;, etc. Annotating dictionaries, tuples, and similar containers together with their contents is also possible, but you need to import the corresponding generic types from the &lt;code&gt;typing&lt;/code&gt; module.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tree.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tuples&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="s"&gt;"""
    Return a tree given tuples of (child, father)

    The tree structure is as follows:

        tree = {node_1: [node_2, node_3], 
                node_2: [node_4, node_5, node_6],
                node_6: [node_7, node_8]}
    """&lt;/span&gt;
    &lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;father&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tuples&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;father&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;father&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;create_tree&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;6.0&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
&lt;span class="c1"&gt;# will print
# defaultdict(&amp;lt;class 'list'&amp;gt;, {1.0: [2.0, 3.0], 3.0: [4.0], 6.0: [1.0]})
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;While the code is simple, it introduces a couple of extra elements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;Iterable&lt;/code&gt; type for the &lt;code&gt;tuples&lt;/code&gt; variable. This type indicates that the object should conform to the &lt;code&gt;collections.abc.Iterable&lt;/code&gt; specification (i.e. implement &lt;code&gt;__iter__&lt;/code&gt;). This is needed because we iterate over &lt;code&gt;tuples&lt;/code&gt; in the &lt;code&gt;for&lt;/code&gt; loop;&lt;/li&gt;
&lt;li&gt;The types inside our container objects: the &lt;code&gt;Iterable&lt;/code&gt; contains &lt;code&gt;Tuple&lt;/code&gt;s, the &lt;code&gt;Tuple&lt;/code&gt;s are composed of pairs of &lt;code&gt;int&lt;/code&gt;, and so on.&lt;/li&gt;
&lt;/ul&gt;
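
&lt;p&gt;Annotating a parameter as &lt;code&gt;Iterable&lt;/code&gt; rather than &lt;code&gt;List&lt;/code&gt; keeps a function open to anything that can be looped over. The sketch below uses a hypothetical helper, &lt;code&gt;count_pairs&lt;/code&gt;, which is not part of the article’s code, to show a list and a generator both satisfying the same annotation.&lt;/p&gt;

```python
from typing import Iterable, Tuple

def count_pairs(pairs: Iterable[Tuple[int, int]]) -> int:
    """Count the pairs by looping once; any iterable of int pairs will do."""
    total = 0
    for _child, _father in pairs:
        total += 1
    return total

# A list is iterable...
as_list = count_pairs([(2, 1), (3, 1)])
# ...and so is a generator expression, which has no length or indexing.
as_generator = count_pairs((n, 1) for n in range(2, 5))
print(as_list, as_generator)
```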

&lt;p&gt;Ok, let’s try to type check it!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ mypy tree.py
tree.py:14: error: Need type annotation for 'tree'

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Uh-oh, what’s happening? Basically Mypy is complaining about this line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tree&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;While we know that the return type should be &lt;code&gt;DefaultDict[int, List[int]]&lt;/code&gt;, Mypy cannot infer that &lt;code&gt;tree&lt;/code&gt; is indeed of that type. We need to help it out by specifying the type of &lt;code&gt;tree&lt;/code&gt;, using the same syntax as in the signature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;If we now re-run Mypy, all is well:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ mypy tree.py
$
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  Type aliases
&lt;/h2&gt;

&lt;p&gt;Sometimes our code reuses the same composite types over and over again. In the above example, &lt;code&gt;Tuple[int, int]&lt;/code&gt; might be such a case. To make our intent clearer (and shorten our code), we can use type aliases. Type aliases are very easy to use: we just assign a type to a variable, and use that variable as the new type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;Relation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tuples&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Relation&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="s"&gt;"""
    Return a tree given tuples of (child, father)

    The tree structure is as follows:

        tree = {node_1: [node_2, node_3], 
                node_2: [node_4, node_5, node_6],
                node_6: [node_7, node_8]}
    """&lt;/span&gt;
    &lt;span class="c1"&gt;# convert to dict
&lt;/span&gt;    &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;father&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tuples&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;father&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;father&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tree&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
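
&lt;p&gt;An alias is just another name for the same type, not a new type, so the alias and the spelled-out form are interchangeable. A small sketch, with a hypothetical &lt;code&gt;children_of&lt;/code&gt; helper that is not part of the article’s code:&lt;/p&gt;

```python
from typing import Iterable, List, Tuple

# The alias: a plain assignment at module level.
Relation = Tuple[int, int]

def children_of(father: int, relations: Iterable[Relation]) -> List[int]:
    """Return the children recorded for the given father."""
    return [child for child, parent in relations if parent == father]

# The alias compares equal to the type it stands for.
print(Relation == Tuple[int, int])
print(children_of(1, [(2, 1), (3, 1), (4, 3)]))
```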



&lt;h2&gt;
  
  
  Generics
&lt;/h2&gt;

&lt;p&gt;Experienced programmers of statically typed languages might have noticed that defining a Relation as a tuple of integers is a bit restricting. Can’t &lt;code&gt;create_tree&lt;/code&gt; work with a float, or a string, or the ad-hoc class that we just created?&lt;/p&gt;

&lt;p&gt;In principle, there’s nothing that prevents us from using it like that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tree.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="n"&gt;Relation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tuples&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Iterable&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Relation&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DefaultDict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;create_tree&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;6.0&lt;/span&gt;&lt;span class="p"&gt;)]))&lt;/span&gt;
&lt;span class="c1"&gt;# will print
# defaultdict(&amp;lt;class 'list'&amp;gt;, {1.0: [2.0, 3.0], 3.0: [4.0], 6.0: [1.0]})
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;However, if we ask Mypy’s opinion of the code, we’ll get an error:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;mypy tree.py
tree.py:24: error: List item 0 has incompatible &lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="s1"&gt;'Tuple[float, float]'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; expected &lt;span class="s1"&gt;'Tuple[int, int]'&lt;/span&gt;
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;There is a way in Python to fix this. It’s called &lt;code&gt;TypeVar&lt;/code&gt;, and it works by creating a generic type that makes no assumptions about the concrete type: it only requires that the type stays the same wherever the variable appears in a signature. Usage is pretty simple...&lt;/p&gt;
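
&lt;p&gt;The full article continues on Kite’s blog. As a rough sketch of the idea only, and not necessarily the code the author uses there, a generic &lt;code&gt;create_tree&lt;/code&gt; might look like this. The name &lt;code&gt;T&lt;/code&gt; is an arbitrary choice.&lt;/p&gt;

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple, TypeVar

# T stands for "some type"; a checker binds it separately for each call.
T = TypeVar('T')

def create_tree(tuples: Iterable[Tuple[T, T]]) -> DefaultDict[T, List[T]]:
    """Return a tree given tuples of (child, father)."""
    tree: DefaultDict[T, List[T]] = defaultdict(list)
    for child, father in tuples:
        if father:
            tree[father].append(child)
    return tree

# The same function now accepts floats...
float_tree = create_tree([(2.0, 1.0), (3.0, 1.0), (4.0, 3.0), (1.0, 6.0)])
# ...or strings, as long as child and father share a type.
str_tree = create_tree([('b', 'a'), ('c', 'a')])
print(dict(float_tree))
print(dict(str_tree))
```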

&lt;p&gt;Check out the code -- &lt;a href="https://kite.com/blog/python/type-hinting/"&gt;Guide: Type Hinting in Python 3.5&lt;br&gt;
&lt;/a&gt; on Kite's blog.&lt;br&gt;
&lt;em&gt;Giovanni Lanzani&lt;/em&gt; is the Director of Learning and Development at GoDataDriven. &lt;/p&gt;

</description>
      <category>datascience</category>
      <category>python</category>
      <category>tutorial</category>
      <category>codequality</category>
    </item>
    <item>
      <title>Statistical Modeling with Python: How-to &amp; Top Libraries</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Thu, 12 Sep 2019 03:51:43 +0000</pubDate>
      <link>https://dev.to/kite/statistical-modeling-with-python-how-to-top-libraries-lin</link>
      <guid>https://dev.to/kite/statistical-modeling-with-python-how-to-top-libraries-lin</guid>
      <description>&lt;h2&gt;
  
  
  Statistical Modeling with Python: How-to &amp;amp; Top Libraries
&lt;/h2&gt;

&lt;p&gt;This post covers some of the essential statistical modeling frameworks and methods for Python, which can help us do statistical modeling and probabilistic computation.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introduction: Why Python for data science&lt;/li&gt;
&lt;li&gt;Why these frameworks are necessary&lt;/li&gt;
&lt;li&gt;Start with NumPy&lt;/li&gt;
&lt;li&gt;Matplotlib and Seaborn for visualization&lt;/li&gt;
&lt;li&gt;Using Seaborn and Matplotlib&lt;/li&gt;
&lt;li&gt;SciPy for inferential statistics&lt;/li&gt;
&lt;li&gt;Statsmodels for advanced modeling&lt;/li&gt;
&lt;li&gt;Scikit-learn for statistical learning&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why these frameworks are necessary
&lt;/h2&gt;

&lt;p&gt;While Python is most popular for data wrangling, visualization, general machine learning, deep learning and associated linear algebra (tensor and matrix operations), and web integration, its statistical modeling abilities are far less advertised. A large percentage of data scientists still use other special statistical languages such as R, MATLAB, or SAS over Python for their modeling and analysis.&lt;/p&gt;

&lt;p&gt;While each of these alternatives offers its own unique blend of features and power for statistical analyses, it’s useful for an up-and-coming data scientist to know more about various Python frameworks and methods that can be used for routine operations of descriptive and inferential statistics.&lt;/p&gt;

&lt;p&gt;The biggest motivation for learning about these frameworks is that statistical inference and probabilistic modeling represent the bread and butter of a data scientist’s daily work. Moreover, only by using such Python-based tools can a powerful end-to-end data science pipeline (a complete flow extending from data acquisition to final business decision generation) be built using a single programming language.&lt;/p&gt;

&lt;p&gt;If you use different languages and tools for different tasks, you may run into problems. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Conducting any web scraping and database access using SQL commands and Python libraries such as BeautifulSoup and SQLAlchemy&lt;/li&gt;
&lt;li&gt;Cleaning up and preparing your data tables using Pandas, but then switching to R or SPSS for performing statistical tests and computing confidence intervals&lt;/li&gt;
&lt;li&gt;Using ggplot2 for creating visualization, and then using a standalone LaTeX editor to type up the final analytics report&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Switching between multiple programmatic frameworks makes the process cumbersome and error-prone.&lt;/p&gt;

&lt;p&gt;What if you could do statistical modeling, analysis, and visualization all inside a core Python platform? Let’s see what frameworks and methods exist for accomplishing such tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with NumPy
&lt;/h2&gt;

&lt;p&gt;NumPy is the de facto standard for numerical computation in Python, used as the base for building more advanced libraries for data science and machine learning applications such as TensorFlow or Scikit-learn. For numeric processing, NumPy is much faster than native Python code due to the vectorized implementation of its methods and the fact that many of its core routines are written in C (as extension code for CPython, the reference interpreter).&lt;/p&gt;

&lt;p&gt;Although the majority of NumPy-related discussions are focused on its linear algebra routines, it offers a decent set of statistical modeling functions for performing basic descriptive statistics and generating random variables based on various discrete and continuous distributions.&lt;/p&gt;

&lt;p&gt;For example, let’s create a NumPy array from a simple Python list and compute basic descriptive statistics like mean, median, standard deviation, quantiles, etc.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The code for this article may be found at &lt;a href="https://github.com/kiteco/kite-python-blog-post-code/tree/master/statistical-modeling"&gt;Kite’s Github repository&lt;/a&gt;.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# Define a python list
&lt;/span&gt;&lt;span class="n"&gt;a_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;5.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;6.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;7.5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Convert the list into a NumPy array
&lt;/span&gt;&lt;span class="n"&gt;an_array&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Compute and print various statistics
&lt;/span&gt;&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Mean:'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;an_array&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Median:'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;an_array&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Range (Max - min):'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ptp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;an_array&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Standard deviation:'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;an_array&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'80th percentile:'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;percentile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;an_array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'0.2-quantile:'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quantile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;an_array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The results are as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Mean: 3.5
Median: 4.0
Range (Max - min): 9.5
Standard deviation: 2.9068883707497264
80th percentile: 5.699999999999999
0.2-quantile: 1.4000000000000001
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
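
&lt;p&gt;The random-variable side mentioned earlier can be sketched in the same spirit. This is a minimal example, assuming NumPy 1.17 or later for np.random.default_rng; the seed, the distribution parameters, and the sample sizes are arbitrary choices for illustration and do not come from the article.&lt;/p&gt;

```python
import numpy as np

# Seeded generator so the draws are reproducible (the seed value is arbitrary)
rng = np.random.default_rng(42)

# Continuous distribution: 1,000 draws from a normal with mean 3.5 and std 2.9
normal_draws = rng.normal(loc=3.5, scale=2.9, size=1000)

# Discrete distribution: 1,000 draws from a binomial (10 trials, p = 0.5)
binomial_draws = rng.binomial(n=10, p=0.5, size=1000)

# Sample statistics should land near the parameters chosen above
print('Normal sample mean:', normal_draws.mean())
print('Normal sample std:', normal_draws.std())
print('Binomial sample mean:', binomial_draws.mean())
```

&lt;p&gt;With only 1,000 draws, the sample mean and standard deviation will be close to, but not exactly equal to, the parameters passed in.&lt;/p&gt;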



&lt;p&gt;&lt;em&gt;To read more about NumPy, Matplotlib, Seaborn, and Statsmodels, check out the &lt;a href="https://kite.com/blog/python/statistical-modeling-python-libraries/"&gt;full article by Tirtha Sarkar&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tirtha Sarkar is a semiconductor technologist, data science author, and author of pydbgen, MLR, and doepy packages. He holds a Ph.D. in Electrical Engineering and M.S. in Data Analytics.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
      <category>statistics</category>
    </item>
    <item>
      <title>Data Science, the Good, the Bad, and the... Future</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Thu, 01 Aug 2019 19:08:51 +0000</pubDate>
      <link>https://dev.to/kite/data-science-the-good-the-bad-and-the-future-jf8</link>
      <guid>https://dev.to/kite/data-science-the-good-the-bad-and-the-future-jf8</guid>
      <description>&lt;h2&gt;
  
  
  Positives: astrophysics, biology, and sports
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Part of a new article by Kirit Thadaka on the &lt;a href="https://kite.com/blog/python/future-of-data-science/"&gt;Kite Blog&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Data science has made a huge impact on the way technology influences our lives. Some of these impacts have been nice and some have been otherwise. &lt;em&gt;looks at Facebook&lt;/em&gt; But technology can’t inherently be good or bad; technology is… technology. It’s the way we use it that has good or bad outcomes.&lt;/p&gt;

&lt;p&gt;We recently had a breakthrough in astrophysics with the first-ever picture of a black hole. This helps physicists confirm more than a century of purely theoretical work around black holes and the theory of relativity.&lt;/p&gt;

&lt;p&gt;To capture this image, scientists used a telescope as big as the earth (Event Horizon Telescope or EHT) by combining data from an array of eight ground-based radio telescopes and making sense of it all to construct an image. Analyzing data and then visualizing that data – sounds like some data science right here.&lt;/p&gt;

&lt;p&gt;A cool side note on this point: a Python library of functions for EHT imaging was developed by Andrew Chael from Harvard to simulate and manipulate VLBI (very-long-baseline interferometry) data, helping with the process of creating the black hole image.&lt;/p&gt;

&lt;p&gt;Olivier Elemento at Cornell uses big data analytics to help identify mutations in genomes that result in tumor cells spreading so that they can be killed earlier – this is a huge positive impact data science has on human life. You can read more about his incredible research here.&lt;/p&gt;

&lt;p&gt;Researchers in his lab use Python while testing statistical and machine learning models. Keras, NumPy, SciPy, and Scikit-learn are some top-notch Python libraries for this.&lt;/p&gt;

&lt;p&gt;If you’re a fan of the English Premier League, you’ll appreciate the example of Leicester City winning the title in the 2015-2016 season.&lt;/p&gt;

&lt;p&gt;At the start of the season, bookmakers had the likelihood of Leicester City winning the EPL at 10 times less than the odds of finding the Loch Ness monster. For a more detailed attempt at describing the significance of this story, read this.&lt;/p&gt;

&lt;p&gt;Everyone wanted to know how Leicester was able to do this, and it turns out that data science played a big part! Thanks to their investment in analytics and technology, the club was able to measure players’ fitness levels and body condition while they were training to help prevent injuries, all while assessing the best tactics to use in a game based on the players’ energy levels.&lt;/p&gt;

&lt;p&gt;All training sessions had plans backed by real data about the players, and as a result Leicester City suffered the fewest player injuries of all clubs that season.&lt;/p&gt;

&lt;p&gt;Many top teams use data analytics to help with player performance, scouting talent, and understanding how to plan for certain opponents.&lt;/p&gt;

&lt;p&gt;Here’s an example of Python being used to help with some football analysis. I certainly wish Chelsea F.C. would use some of these techniques to improve their woeful form and make my life as a fan better. You don’t need analytics to see that Kante is in the wrong position, and Jorginho shouldn’t be in that team and… Okay I’m digressing – back to the topic now!&lt;/p&gt;

&lt;p&gt;Now that we’ve covered some of the amazing things data science has uncovered, I’m going to touch on some of the negatives as well – it’s important to think critically about technology and how it impacts us.&lt;/p&gt;

&lt;p&gt;Technology’s impact on our lives will undeniably increase with time, and we shouldn’t limit our understanding of it by ignoring the positive and negative implications it can have.&lt;/p&gt;

&lt;p&gt;Some of the concerns I have around this ecosystem are data privacy (I’m sure we all have many examples that come to mind), biases in predictions and classifications, and the impact of personalization and advertising on society.&lt;/p&gt;

&lt;h2&gt;
  
  
  Negatives: gender bias and more
&lt;/h2&gt;

&lt;p&gt;This paper published in NIPS talks about how to counter gender biases in word embeddings used frequently in data science.&lt;/p&gt;

&lt;p&gt;For those who aren’t familiar with the term, word embeddings are a clever way of representing words so that neural networks and other computer algorithms can process them.&lt;/p&gt;
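
&lt;p&gt;To make that idea concrete, here is a toy sketch of how relatedness between words can be measured once they are vectors. The three-dimensional vectors and the word choices below are made up for illustration and are not taken from Word2Vec or from the paper; the cosine_similarity helper is likewise a hypothetical name defined just for this example.&lt;/p&gt;

```python
import numpy as np

# Made-up 3-dimensional vectors for illustration only; real embeddings such as
# Word2Vec are learned from text and typically have hundreds of dimensions
embeddings = {
    'king': np.array([0.9, 0.8, 0.1]),
    'queen': np.array([0.9, 0.7, 0.2]),
    'banana': np.array([0.1, 0.0, 0.9]),
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Words with similar vectors score high, words with dissimilar vectors score low
sim_related = cosine_similarity(embeddings['king'], embeddings['queen'])
sim_unrelated = cosine_similarity(embeddings['king'], embeddings['banana'])

print('king vs queen:', sim_related)
print('king vs banana:', sim_unrelated)
```

&lt;p&gt;Because learned embeddings place words that appear in similar contexts close together, any associations present in the training text show up in similarity scores like these.&lt;/p&gt;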

&lt;p&gt;The data used to create Word2Vec (a model for word embeddings created by Google) has resulted in gender biases that show close relations between “men” and words like “computer scientist”, “architect”, “captain”, etc. while showing “women” to be closely related to “homemaker”, “nanny”, “nurse”, etc.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DCQs2CA5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/cks5zh09ze70wklc5cru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DCQs2CA5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://thepracticaldev.s3.amazonaws.com/i/cks5zh09ze70wklc5cru.png" alt=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here’s the Python code used by the researchers who published this paper. Python’s ease of use makes it a good choice for quickly going from idea to implementation...&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href="https://kite.com/blog/python/future-of-data-science/?fbclid=IwAR0lfJ5Hi5SqrOOxER_eobpeI3E0LofNnknQmABjQh8J_-8UOEn3_ZkKbbk"&gt;code and examples&lt;/a&gt;, and the rest of the negatives, on our blog!&lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>First, use the right tool for the job. Check out the options in this pocket-sized guide.</title>
      <dc:creator>Aaron Harris</dc:creator>
      <pubDate>Fri, 19 Apr 2019 23:13:26 +0000</pubDate>
      <link>https://dev.to/kite/first-use-the-right-tool-for-the-job-check-out-the-options-in-this-pocket-sized-guide-i05</link>
      <guid>https://dev.to/kite/first-use-the-right-tool-for-the-job-check-out-the-options-in-this-pocket-sized-guide-i05</guid>
      <description>&lt;p&gt;So you want to set up a superior Python environment, but you don’t want it to be a major hassle. Well, this was written for you!&lt;/p&gt;

&lt;p&gt;In this post, we explore the top IDEs and general-purpose editors for all your Python programming needs.&lt;/p&gt;

&lt;p&gt;What Are IDEs?&lt;br&gt;
An integrated development environment (IDE) provides Python programmers with a suite of tools that streamline the coding, testing, and debugging process for specific use cases.&lt;/p&gt;

&lt;p&gt;The best Python IDE for you is the one that will help you ship code faster by automating repetitive tasks, organizing information, and helping reduce errors.&lt;/p&gt;

&lt;p&gt;What Is a Code Editor?&lt;br&gt;
Code editors are tools that make writing code easier, offering syntax highlighting and code formatting, among other things. They differ from IDEs in that they have fewer features outside of their primary use case, which is writing code.&lt;/p&gt;

&lt;p&gt;For this reason, they are typically quicker and more lightweight, which leads some developers to prefer a code editor over an IDE.&lt;/p&gt;

&lt;p&gt;However, some code editors also deliver additional functionality, like debugging and code execution.&lt;/p&gt;

&lt;p&gt;Want to know more? Check out:&lt;br&gt;
&lt;a href="https://owlskip.com/s/top-ides"&gt;https://owlskip.com/s/top-ides&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>editors</category>
      <category>webdev</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
