<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Paridhi Agarwal</title>
    <description>The latest articles on DEV Community by Paridhi Agarwal (@paridhi).</description>
    <link>https://dev.to/paridhi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F254235%2F8f3d8378-e6ea-4d8b-865f-3674a1756151.jpg</url>
      <title>DEV Community: Paridhi Agarwal</title>
      <link>https://dev.to/paridhi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/paridhi"/>
    <language>en</language>
    <item>
      <title>Rough post</title>
      <dc:creator>Paridhi Agarwal</dc:creator>
      <pubDate>Sun, 28 Jan 2024 08:12:09 +0000</pubDate>
      <link>https://dev.to/paridhi/rough-post-522g</link>
      <guid>https://dev.to/paridhi/rough-post-522g</guid>
      <description>&lt;p&gt;rough post&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to make comical visualizations in Python: Explained using Netflix Movie and TV Show dataset</title>
      <dc:creator>Paridhi Agarwal</dc:creator>
      <pubDate>Tue, 19 Oct 2021 20:18:12 +0000</pubDate>
      <link>https://dev.to/paridhi/how-to-make-comical-visualizations-in-python-explained-using-netflix-movie-and-tv-show-dataset-4418</link>
      <guid>https://dev.to/paridhi/how-to-make-comical-visualizations-in-python-explained-using-netflix-movie-and-tv-show-dataset-4418</guid>
      <description>&lt;p&gt;After you’re done watching a &lt;a href="https://www.youtube.com/watch?v=3sxg1xXmd0I"&gt;brilliant show&lt;/a&gt; or &lt;a href="https://www.youtube.com/watch?v=prwUFBsDRLk&amp;amp;t=10s"&gt;movie&lt;/a&gt; on Netflix, does it ever occur to you just how awesome Netflix is for giving you access to this amazing plethora of content? Surely, I’m not alone in this, am I?&lt;/p&gt;

&lt;p&gt;One thought leads to another, and before you know it, you’ve made up your mind to do an exploratory data analysis to find out more about who the most popular actors are and which country prefers which genre.&lt;/p&gt;

&lt;p&gt;Now, I’ve spent my fair share of time making regular bar plots and pie plots using Python, and while they do a perfect job in conveying the results, I wanted to add a little fun element to this project.&lt;/p&gt;

&lt;p&gt;I recently learned that you can create &lt;a href="https://matplotlib.org/stable/gallery/showcase/xkcd.html"&gt;XKCD-like plots&lt;/a&gt; in Matplotlib, Python’s most popular data viz library, and decided that I should &lt;em&gt;comify&lt;/em&gt; all my plots in this project just to make things a little more interesting.&lt;/p&gt;

&lt;p&gt;Let’s take a look at what the data has to say!&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The data&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;I used &lt;a href="https://www.kaggle.com/shivamb/netflix-shows"&gt;this dataset&lt;/a&gt;, which is available on Kaggle. It contains 7,787 movie and TV show titles available on Netflix as of 2020.&lt;/p&gt;

&lt;p&gt;To start off, I imported the required libraries and read the CSV file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.dpi'] = 200

df = pd.read_csv("../input/netflix-shows/netflix_titles.csv")
df.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bbRYNRzq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lyuz5q0e5enupbwbxqp3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bbRYNRzq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lyuz5q0e5enupbwbxqp3.png" alt="The raw dataset"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I also added new features to the dataset that will be used later on in the project.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df["date_added"] = pd.to_datetime(df['date_added'])
df['year_added'] = df['date_added'].dt.year.astype('Int64')
df['month_added'] = df['date_added'].dt.month

df['season_count'] = df.apply(lambda x : x['duration'].split(" ")[0] if "Season" in x['duration'] else "", axis = 1)
df['duration'] = df.apply(lambda x : x['duration'].split(" ")[0] if "Season" not in x['duration'] else "", axis = 1)
df.head()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sesgIFsq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y9lywia58wdj73aqa9d0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sesgIFsq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y9lywia58wdj73aqa9d0.png" alt="Dataset after adding a few other features"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we can get to the interesting stuff!&lt;/p&gt;

&lt;p&gt;Let me also add that, to XKCDify plots in Matplotlib, you just need to wrap all your plotting code in the following block and you’ll be all set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;with plt.xkcd():
    # all your regular visualization code goes in here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--G_Q8hmzA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/eou2ytvnu2w0li1fcbvw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--G_Q8hmzA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/eou2ytvnu2w0li1fcbvw.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Netflix through the years
&lt;/h3&gt;

&lt;p&gt;First, I thought it would be worth looking at a timeline that depicts the evolution of Netflix over the years.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from datetime import datetime
## these go on the numbers below
tl_dates = [
    "1997\nFounded",
    "1998\nMail Service",
    "2003\nGoes Public",
    "2007\nStreaming service",
    "2016\nGoes Global",
    "2021\nNetflix &amp;amp; Chill"
]
tl_x = [1, 2, 4, 5.3, 8,9]
## the numbers go on these
tl_sub_x = [1.5,3,5,6.5,7]
tl_sub_times = [
    "1998","2000","2006","2010","2012"
]
tl_text = [
    "Netflix.com launched",
    "Starts\nPersonal\nRecommendations","Billionth DVD Delivery","Canadian\nLaunch","UK Launch"]
with plt.xkcd():
# Set figure &amp;amp; Axes
    fig, ax = plt.subplots(figsize=(15, 4), constrained_layout=True)
    ax.set_ylim(-2, 1.75)
    ax.set_xlim(0, 10)
# Timeline : line
    ax.axhline(0, xmin=0.1, xmax=0.9, c='deeppink', zorder=1)
# Timeline : Date Points
    ax.scatter(tl_x, np.zeros(len(tl_x)), s=120, c='palevioletred', zorder=2)
    ax.scatter(tl_x, np.zeros(len(tl_x)), s=30, c='darkmagenta', zorder=3)
    # Timeline : Time Points
    ax.scatter(tl_sub_x, np.zeros(len(tl_sub_x)), s=50, c='darkmagenta',zorder=4)
# Date Text
    for x, date in zip(tl_x, tl_dates):
        ax.text(x, -0.55, date, ha='center', 
                fontfamily='serif', fontweight='bold',
                color='royalblue',fontsize=12)
# Stemplot : vertical line
    levels = np.zeros(len(tl_sub_x))    
    levels[::2] = 0.3
    levels[1::2] = -0.3
    # line collections are the default since Matplotlib 3.3, so use_line_collection is not needed
    markerline, stemline, baseline = ax.stem(tl_sub_x, levels)
    plt.setp(baseline, zorder=0)
    plt.setp(markerline, marker=',', color='darkmagenta')
    plt.setp(stemline, color='darkmagenta')
# Text
    for idx, x, time, txt in zip(range(1, len(tl_sub_x)+1), tl_sub_x, tl_sub_times, tl_text):
        ax.text(x, 1.3*(idx%2)-0.5, time, ha='center', 
                fontfamily='serif', fontweight='bold',
                color='royalblue', fontsize=11)
        ax.text(x, 1.3*(idx%2)-0.6, txt, va='top', ha='center',
                fontfamily='serif',color='royalblue')

# Spine
    for spine in ["left", "top", "right", "bottom"]:
        ax.spines[spine].set_visible(False)
# Ticks    
    ax.set_xticks([]) 
    ax.set_yticks([])
# Title
    ax.set_title("Netflix through the years", fontweight="bold", fontfamily='serif', fontsize=16, color='royalblue')
    ax.text(2.4,1.57,"From DVD rentals to a global audience of over 150m people - is it time for Netflix to Chill?", fontfamily='serif', fontsize=12, color='mediumblue')
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vehYkN9G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/48s7g44k21wm6s536ys8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vehYkN9G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/48s7g44k21wm6s536ys8.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This plot paints a pretty decent picture of Netflix’s journey. Also, the plot looks hand-drawn because of the &lt;code&gt;plt.xkcd()&lt;/code&gt; function. Wicked stuff.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Movies vs TV Shows
&lt;/h3&gt;

&lt;p&gt;Next, I decided to take a look at the ratio of Movies vs TV Shows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;col = "type"
# rename_axis + reset_index(name=...) yields the columns "type" and "count" on both old and new pandas versions
grouped = df[col].value_counts().rename_axis(col).reset_index(name="count")
with plt.xkcd():
    explode = (0, 0.1)  # only "explode" the 2nd slice (i.e. 'TV Show')
    fig1, ax1 = plt.subplots(figsize=(5, 5), dpi=100)
    ax1.pie(grouped["count"], explode=explode, labels=grouped["type"], autopct='%1.1f%%',
        shadow=True, startangle=90)
    ax1.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_EYEcT-z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/33xm9mfhe747u2gwahex.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_EYEcT-z--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/33xm9mfhe747u2gwahex.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The number of TV shows on the platform is less than a third of the total content. So probably, both you and I have better chances of finding a relatively good movie than a TV Show on Netflix. &lt;em&gt;*sighs*&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Countries with the most content
&lt;/h3&gt;

&lt;p&gt;For my third visualization, I wanted to make a horizontal bar graph that represented the top 25 countries with the most content. The &lt;code&gt;country&lt;/code&gt; column in the dataframe had a few rows that contained more than 1 country (separated by commas).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--oDI945zb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/emv3gmyjtmb204h13xbj.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oDI945zb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/emv3gmyjtmb204h13xbj.jpeg" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To handle this, I split the data in the country column with &lt;code&gt;", "&lt;/code&gt; as the separator and then put all the countries into a list called &lt;code&gt;categories&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from collections import Counter
col = "country"
categories = ", ".join(df[col].fillna("")).split(", ")
counter_list = Counter(categories).most_common(25)
counter_list = [_ for _ in counter_list if _[0] != ""]
labels = [_[0] for _ in counter_list]
values = [_[1] for _ in counter_list]
with plt.xkcd():
    fig, ax = plt.subplots(figsize=(10, 10), dpi=100)
    y_pos = np.arange(len(labels))
    ax.barh(y_pos, values, align='center')
    ax.set_yticks(y_pos)
    ax.set_yticklabels(labels)
    ax.invert_yaxis()  # labels read top-to-bottom
    ax.set_xlabel('Content')
    ax.set_title('Countries with most content')
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ryf8T8sn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9u6njf6ovk5fmkqst3a6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ryf8T8sn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/9u6njf6ovk5fmkqst3a6.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some overall thoughts after looking at the plot above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The vast majority of content on Netflix is from the United States (quite obvious).&lt;/li&gt;
&lt;li&gt;Even though Netflix launched quite late in India (in 2016), it’s already in the second position right after the US. So, India is a big market for Netflix.&lt;/li&gt;
&lt;li&gt;I’m going to look for content from Thailand on Netflix, now that I know that it’s there on the platform, brb.&lt;/li&gt;
&lt;/ul&gt;
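
&lt;p&gt;The join-and-split step used for the &lt;code&gt;country&lt;/code&gt; column can be checked on a toy example. This is only a sketch: the &lt;code&gt;rows&lt;/code&gt; list is invented, and unlike the code above it drops the empty entries before taking the top results, so the cut always returns the requested number of countries.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical rows; None stands in for a missing country value
rows = [
    "United States, India",
    "India",
    None,
    "United Kingdom, United States",
    "United States",
]

# Join every row into one long string, then split it back into single names
joined = ", ".join(r if r is not None else "" for r in rows)
names = [n for n in joined.split(", ") if n != ""]

# Count each country and keep the two most frequent ones
top = Counter(names).most_common(2)
print(top)
```

&lt;p&gt;A title produced by several countries is counted once for each of them, so the totals can add up to more than the number of titles.&lt;/p&gt;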

&lt;h3&gt;
  
  
  4. Popular directors and actors
&lt;/h3&gt;

&lt;p&gt;To take a look at the popular directors and actors, I decided to plot a figure (each) with six subplots from the top six countries with the most content and make horizontal bar charts for each subplot. Take a look at the plots below and read that first line again. 😛&lt;/p&gt;

&lt;h4&gt;
  
  
  a. Popular directors:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from collections import Counter
from matplotlib.pyplot import figure
import math
colours = ["orangered", "mediumseagreen", "darkturquoise", "mediumpurple", "deeppink", "indianred"]
countries_list = ["United States", "India", "United Kingdom", "Japan", "France", "Canada"]
col = "director"
with plt.xkcd():
    figure(num=None, figsize=(20, 8)) 
    x=1
    for country in countries_list:
        country_df = df[df["country"]==country]
        categories = ", ".join(country_df[col].fillna("")).split(", ")
        counter_list = Counter(categories).most_common(6)
        counter_list = [_ for _ in counter_list if _[0] != ""]
        labels = [_[0] for _ in counter_list][::-1]
        values = [_[1] for _ in counter_list][::-1]
        if max(values)&amp;lt;10:
            values_int = range(0, math.ceil(max(values))+1)
        else:
            values_int = range(0, math.ceil(max(values))+1, 2)
        plt.subplot(2, 3, x)
        plt.barh(labels,values, color = colours[x-1])
        plt.xticks(values_int)
        plt.title(country)
        x+=1
    plt.suptitle('Popular Directors with the most content')
    plt.tight_layout()
    plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--cpuQr3fM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wkgbfri13l0nhhh84u8v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cpuQr3fM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wkgbfri13l0nhhh84u8v.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  b. Popular actors:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;col = "cast"
with plt.xkcd():
    figure(num=None, figsize=(20, 8)) 
    x=1
    for country in countries_list:
        df["from_country"] = df['country'].fillna("").apply(lambda x : 1 if country.lower() in x.lower() else 0)
        small = df[df["from_country"] == 1]
        cast = ", ".join(small['cast'].fillna("")).split(", ")
        tags = Counter(cast).most_common(11)
        tags = [_ for _ in tags if "" != _[0]]
        labels, values = [_[0]+"  " for _ in tags][::-1], [_[1] for _ in tags][::-1]
        if max(values)&amp;lt;10:
            values_int = range(0, math.ceil(max(values))+1)
        elif max(values)&amp;gt;=10 and max(values)&amp;lt;=20:
            values_int = range(0, math.ceil(max(values))+1, 2)
        else:
            values_int = range(0, math.ceil(max(values))+1, 5)
        plt.subplot(2, 3, x)
        plt.barh(labels,values, color = colours[x-1])
        plt.xticks(values_int)
        plt.title(country)
        x+=1
    plt.suptitle('Popular Actors with the most content')
    plt.tight_layout()
    plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LSH7qt7q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5r4floab1c821ytqnsrx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LSH7qt7q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5r4floab1c821ytqnsrx.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Some of the oldest movies and TV shows
&lt;/h3&gt;

&lt;p&gt;I thought it would be quite interesting to look at the oldest movies and TV shows available on Netflix and see how far back they date.&lt;/p&gt;

&lt;h4&gt;
  
  
  a. Oldest movies:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;small = df.sort_values("release_year", ascending = True)
#small.duration stores empty values if the content type is 'TV Show'
small = small[small['duration'] != ""].reset_index()
small[['title', "release_year"]][:15]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3jBF7yGi--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2hdo47l71cfv06rykkfc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3jBF7yGi--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2hdo47l71cfv06rykkfc.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  b. Oldest TV shows:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;small = df.sort_values("release_year", ascending = True)
#small.season_count stores empty values if the content type is 'Movie'
small = small[small['season_count'] != ""].reset_index()
small = small[['title', "release_year"]][:15]
small
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YfAPQRrY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sdnfbqynuk0jlldzyr8t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YfAPQRrY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sdnfbqynuk0jlldzyr8t.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Woah, Netflix has some &lt;em&gt;realllyyy&lt;/em&gt; old movies and TV shows, some even released more than 80 years ago. Have you watched any of these?&lt;/p&gt;

&lt;p&gt;(&lt;strong&gt;Fun fact&lt;/strong&gt;: When he began implementing Python, Guido van Rossum was also reading the published scripts from &lt;a href="https://en.wikipedia.org/wiki/Monty_Python"&gt;“Monty Python’s Flying Circus”&lt;/a&gt;, a BBC comedy series from the 1970s (that was added on Netflix in 2018). Van Rossum thought he needed a name that was short, unique, and slightly mysterious, so he decided to call the language Python.)&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Does Netflix have the latest content?
&lt;/h3&gt;

&lt;p&gt;Yes, Netflix is cool and all for having content from a century ago, but does it also have the latest movies and TV shows? To find this out, first I calculated the difference between the date on which the content was added on Netflix and the release year of that content.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df["year_diff"] = df["year_added"]-df["release_year"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, I created a scatter plot with x-axis as the &lt;em&gt;year difference&lt;/em&gt; and y-axis as the &lt;em&gt;number of movies/TV shows&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;col = "year_diff"
only_movies = df[df["duration"]!=""]
only_shows = df[df["season_count"]!=""]
# rename_axis + reset_index(name=...) gives the columns "year_diff" and "count" on both old and new pandas versions
grouped1 = only_movies[col].value_counts().rename_axis(col).reset_index(name="count")
grouped1 = grouped1.dropna()
grouped1 = grouped1.head(20)
grouped2 = only_shows[col].value_counts().rename_axis(col).reset_index(name="count")
grouped2 = grouped2.dropna()
grouped2 = grouped2.head(20)
with plt.xkcd():
    figure(num=None, figsize=(8, 5)) 
    plt.scatter(grouped1[col], grouped1["count"], color = "hotpink")
    plt.scatter(grouped2[col], grouped2["count"], color = '#88c999')
    values_int = range(0, math.ceil(max(grouped1[col]))+1, 2)
    plt.xticks(values_int)
    plt.xlabel("Difference between the year when the content has been\n added on Netflix and the release year")
    plt.ylabel("Number of Movies/TV Shows")
    plt.legend(["Movies", "TV Shows"])
    plt.tight_layout()
    plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--olyW1w0P--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/54g5bcww2motog4f4y1u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--olyW1w0P--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/54g5bcww2motog4f4y1u.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, the majority of the content on Netflix has been added within a year of its release date. So, Netflix does have the latest content most of the time!&lt;/p&gt;
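
&lt;p&gt;The counting behind that scatter plot can be sketched in plain Python. The &lt;code&gt;titles&lt;/code&gt; list below is invented sample data in the form (year added, release year), with &lt;code&gt;None&lt;/code&gt; standing in for a row that has no date; it is an illustration of the idea, not the notebook's actual code.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical (year_added, release_year) pairs
titles = [(2019, 2019), (2020, 2019), (2018, 2010), (2020, 2020), (None, 2015)]

# Year difference per title, skipping rows with no year_added
diffs = [added - released for added, released in titles if added is not None]

# How many titles share each difference (the y-axis of the scatter plot)
counts = Counter(diffs)
print(sorted(counts.items()))
```

&lt;p&gt;A difference of 0 means the title reached Netflix in its release year, so a tall point at 0 or 1 is what "latest content" looks like in this view.&lt;/p&gt;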

&lt;p&gt;If you’re still here, here’s an xkcd comic for you, you’re welcome.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--j5H2gICb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/0%2AuYPz0xku_bVeAWVV.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--j5H2gICb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/0%2AuYPz0xku_bVeAWVV.png" alt="https://cdn-images-1.medium.com/max/1600/0*uYPz0xku_bVeAWVV.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  7. What kind of content is Netflix focusing upon?
&lt;/h3&gt;

&lt;p&gt;I also wanted to explore the &lt;code&gt;rating&lt;/code&gt; column, compare the amount of content that Netflix has been producing for kids, teens, and adults, and see whether its focus has shifted from one group to another over the years.&lt;/p&gt;

&lt;p&gt;To achieve this, first I took a look at the unique ratings in the dataframe:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(df['rating'].unique())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;['TV-MA' 'R' 'PG-13' 'TV-14' 'TV-PG' 'NR' 'TV-G' 'TV-Y' nan 'TV-Y7' 'PG' 'G' 'NC-17' 'TV-Y7-FV' 'UR']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, I classified the ratings according to the groups (namely — &lt;em&gt;Little Kids&lt;/em&gt;, &lt;em&gt;Older Kids&lt;/em&gt;, &lt;em&gt;Teens,&lt;/em&gt; and &lt;em&gt;Mature&lt;/em&gt;) they fall into and changed their values in the &lt;code&gt;rating&lt;/code&gt; column to their group names.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ratings_group_list = ['Little Kids', 'Older Kids', 'Teens', 'Mature']
ratings_dict={
    'TV-G': 'Little Kids',
    'TV-Y': 'Little Kids',
    'G': 'Little Kids',
    'TV-PG': 'Older Kids',
    'TV-Y7': 'Older Kids',
    'PG': 'Older Kids',
    'TV-Y7-FV': 'Older Kids',
    'PG-13': 'Teens',
    'TV-14': 'Teens',
    'TV-MA': 'Mature',
    'R': 'Mature',
    'NC-17': 'Mature'
}
for rating_val, rating_group in ratings_dict.items():
    df.loc[df.rating == rating_val, "rating"] = rating_group
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
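
&lt;p&gt;As a quick check of the mapping logic, here is a small plain-Python sketch. It uses only a subset of &lt;code&gt;ratings_dict&lt;/code&gt;, and the &lt;code&gt;raw&lt;/code&gt; list is made-up sample data. Like the loop above, it leaves any rating without a group (for example &lt;code&gt;NR&lt;/code&gt;) unchanged.&lt;/p&gt;

```python
# Subset of the ratings dictionary, for illustration only
ratings_subset = {
    "TV-Y": "Little Kids",
    "PG": "Older Kids",
    "TV-14": "Teens",
    "R": "Mature",
}

# Hypothetical ratings, including one unmapped value and one missing value
raw = ["TV-Y", "R", "NR", "PG", None]

# dict.get(key, default) falls back to the original value when no group exists
mapped = [ratings_subset.get(r, r) for r in raw]
print(mapped)
```

&lt;p&gt;Because unmapped values are kept as they are, they simply never match any name in &lt;code&gt;ratings_group_list&lt;/code&gt; and are ignored by the plots that follow.&lt;/p&gt;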



&lt;p&gt;Finally, I made line plots with &lt;em&gt;year&lt;/em&gt; on the x-axis and &lt;em&gt;content count&lt;/em&gt; on the y-axis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['rating_val']=1
x=0
labels=['kinda\nless', 'not so\nbad', 'holy shit\nthat\'s too\nmany']
with plt.xkcd():
    for r in ratings_group_list:
        grouped = df[df['rating']==r]
        # sum only rating_val; summing every column can fail on text and datetime columns in recent pandas
        year_df = grouped.groupby('year_added')['rating_val'].sum().reset_index()
        plt.plot(year_df['year_added'], year_df['rating_val'], color=colours[x], marker='o')
        values_int = range(2008, math.ceil(max(year_df['year_added']))+1, 2)
        plt.yticks([200, 600, 1000], labels)
        plt.xticks(values_int)
        plt.title('Count of shows and movies that Netflix\n has been producing for different audiences', fontsize=12)
        plt.xlabel('Year', fontsize=14)
        plt.ylabel('Content Count', fontsize=14)
        x+=1
    plt.legend(ratings_group_list)
    plt.tight_layout()
    plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rAckInAQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/1%2ARFPCcXrWcHaeEIsKNxZkLg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rAckInAQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/1%2ARFPCcXrWcHaeEIsKNxZkLg.png" alt="https://cdn-images-1.medium.com/max/1600/1*RFPCcXrWcHaeEIsKNxZkLg.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Okay, so the content count for mature audiences on Netflix is way more than the other groups. Another interesting observation is that there was a surge in the count of content produced for &lt;em&gt;Little Kids&lt;/em&gt; from &lt;em&gt;2019–2020&lt;/em&gt; whereas the content for &lt;em&gt;Older Kids&lt;/em&gt;, &lt;em&gt;Teens,&lt;/em&gt; and &lt;em&gt;Mature Audiences&lt;/em&gt; decreased during that time period.&lt;/p&gt;

&lt;h3&gt;
  
  
  8. Top Genres (Countrywise)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;col = "listed_in"
colours = ["violet", "cornflowerblue", "darkseagreen", "mediumvioletred", "blue", "mediumseagreen", "darkmagenta", "darkslateblue", "seagreen"]
countries_list = ["United States", "India", "United Kingdom", "Japan", "France", "Canada", "Spain", "South Korea", "Germany"]
with plt.xkcd():
    figure(num=None, figsize=(20, 8)) 
    x=1
    for country in countries_list:
        df["from_country"] = df['country'].fillna("").apply(lambda x : 1 if country.lower() in x.lower() else 0)
        small = df[df["from_country"] == 1]
        genre = ", ".join(small['listed_in'].fillna("")).split(", ")
        tags = Counter(genre).most_common(3)
        tags = [_ for _ in tags if "" != _[0]]
        labels, values = [_[0]+"  " for _ in tags][::-1], [_[1] for _ in tags][::-1]
        if max(values)&amp;gt;200:
            values_int = range(0, math.ceil(max(values)), 100)
        elif max(values)&amp;gt;100 and max(values)&amp;lt;=200:
            values_int = range(0, math.ceil(max(values))+50, 50)
        else:
            values_int = range(0, math.ceil(max(values))+25, 25)
        plt.subplot(3, 3, x)
        plt.barh(labels,values, color = colours[x-1])
        plt.xticks(values_int)
        plt.title(country)
        x+=1
    plt.suptitle('Top Genres')
    plt.tight_layout()
    plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HW65HGex--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2400/1%2A7mgX4bEYkVf8JLEP1qQBRg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HW65HGex--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/2400/1%2A7mgX4bEYkVf8JLEP1qQBRg.png" alt="https://cdn-images-1.medium.com/max/2400/1*7mgX4bEYkVf8JLEP1qQBRg.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key takeaways from this plot:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dramas and Comedies are the most popular genres in almost every country.&lt;/li&gt;
&lt;li&gt;Japan watches a LOT of anime!&lt;/li&gt;
&lt;li&gt;Romantic TV Shows and TV Dramas are big in South Korea. (I’m addicted to K-Dramas too, btw 😍)&lt;/li&gt;
&lt;li&gt;Children and Family Movies are the third most popular genre in Canada.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  9. Wordclouds
&lt;/h3&gt;

&lt;p&gt;I finally ended the project with two word clouds — first, a word cloud for the &lt;code&gt;description&lt;/code&gt; column and a second one for the &lt;code&gt;title&lt;/code&gt; column.&lt;/p&gt;

&lt;h4&gt;
  
  
  a. Wordcloud for Description:
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from wordcloud import WordCloud
import random
from PIL import Image
import matplotlib
# Custom colour map based on Netflix palette
cmap = matplotlib.colors.LinearSegmentedColormap.from_list("", ['#221f1f', '#b20710'])
text = str(list(df['description'])).replace(',', '').replace('[', '').replace("'", '').replace(']', '').replace('.', '')
mask = np.array(Image.open('../input/finallogo/New Note.png'))
wordcloud = WordCloud(background_color = 'white', width = 500,  height = 200,colormap=cmap, max_words = 150, mask = mask).generate(text)
plt.figure( figsize=(5,5))
plt.imshow(wordcloud, interpolation = 'bilinear')
plt.axis('off')
plt.tight_layout(pad=0)
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Bziu7gBK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/1%2AoifDFya692J-7Q3bFnX-og.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Bziu7gBK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/1%2AoifDFya692J-7Q3bFnX-og.png" alt="https://cdn-images-1.medium.com/max/1600/1*oifDFya692J-7Q3bFnX-og.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Live, love, life, friend, family, world,&lt;/em&gt; and &lt;em&gt;find&lt;/em&gt; are some of the most frequent words in the descriptions of movies and shows. Interestingly, the words &lt;em&gt;one, two, three,&lt;/em&gt; and &lt;em&gt;four&lt;/em&gt; all appear in the word cloud as well.&lt;/p&gt;
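
&lt;p&gt;As a rough cross-check on what the word cloud shows, the same kind of frequency ranking can be sketched with the standard library alone. The sample &lt;code&gt;descriptions&lt;/code&gt; list and the small &lt;code&gt;stop_words&lt;/code&gt; set below are made up for illustration; they stand in for &lt;code&gt;df['description']&lt;/code&gt; and are not the stop words that the wordcloud package uses.&lt;/p&gt;

```python
import re
from collections import Counter

# Hypothetical sample rows standing in for df['description']
descriptions = [
    "A young woman must find her family after one fateful night.",
    "Two friends find love while travelling the world.",
    "One family fights to save the life they love.",
]

# A tiny, illustrative stop-word set (an assumption, not the wordcloud default)
stop_words = {"a", "the", "to", "her", "they", "must", "after", "while"}

# Lowercase everything and keep only alphabetic tokens
words = re.findall(r"[a-z]+", " ".join(descriptions).lower())

# Count every word that is not a stop word
counts = Counter(w for w in words if w not in stop_words)

# Show the most frequent words with their counts
print(counts.most_common(5))
```

&lt;p&gt;On the real dataset, the sample list would be replaced by the values of the &lt;code&gt;description&lt;/code&gt; column.&lt;/p&gt;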

&lt;h3&gt;
  
  
  b. Wordcloud for Title:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Same steps as for the description column (imports are reused), now on the title column
cmap = matplotlib.colors.LinearSegmentedColormap.from_list("", ['#221f1f', '#b20710'])
text = str(list(df['title'])).replace(',', '').replace('[', '').replace("'", '').replace(']', '').replace('.', '')
mask = np.array(Image.open('../input/finallogo/New Note.png'))
wordcloud = WordCloud(background_color='white', width=500, height=200, colormap=cmap, max_words=150, mask=mask).generate(text)
plt.figure(figsize=(5, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.tight_layout(pad=0)
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iJqX80lo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/1%2ABRbwBF9rIUO1F8OvvQjiVw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iJqX80lo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn-images-1.medium.com/max/1600/1%2ABRbwBF9rIUO1F8OvvQjiVw.png" alt="https://cdn-images-1.medium.com/max/1600/1*BRbwBF9rIUO1F8OvvQjiVw.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Do you see &lt;em&gt;Christmas&lt;/em&gt; right at the center of this word cloud? It seems there is an abundance of Christmas movies on Netflix. Other popular words are &lt;em&gt;Love, World, Man, Life, Story, Live, Secret, Girl, Boy, American, Game, Night, Last, Time,&lt;/em&gt; and &lt;em&gt;Day.&lt;/em&gt;&lt;/p&gt;
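
&lt;p&gt;Both word clouds build their input text with the same chain of &lt;code&gt;.replace()&lt;/code&gt; calls. A minimal sketch of a helper that does roughly the same cleaning in one place might look like this. The name &lt;code&gt;column_to_text&lt;/code&gt; and the sample titles are hypothetical, and the helper strips all punctuation rather than only the characters removed in the snippets above.&lt;/p&gt;

```python
import re

def column_to_text(values):
    """Join every value into one string and strip punctuation."""
    joined = " ".join(str(v) for v in values)
    # Replace anything that is not a word character or whitespace with a space
    cleaned = re.sub(r"[^\w\s]", " ", joined)
    # Collapse the repeated whitespace left behind by the substitution
    return " ".join(cleaned.split())

# Hypothetical titles standing in for df['title']
titles = ["A Christmas Story.", "Love, Life and the World"]
text = column_to_text(titles)
print(text)
```

&lt;p&gt;The resulting string could then be passed to &lt;code&gt;WordCloud(...).generate(text)&lt;/code&gt; for either column.&lt;/p&gt;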

&lt;h3&gt;
  
  
  And that’s it!
&lt;/h3&gt;

&lt;p&gt;Working on projects like these is what makes Data Science fun!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you want to add unique projects like this to your resume, join &lt;a href="https://buildtolearn.club/"&gt;Build To Learn Club&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m building it to help aspiring Data professionals build a “dangerously good” resume. It’s for Python enthusiasts who are tired of doing online courses.&lt;/p&gt;

&lt;p&gt;If you have any questions/feedback or would just like to chat, you can reach out to me on &lt;a href="https://twitter.com/paridhitweets"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/paridhi-agarwal-23789b165/"&gt;LinkedIn&lt;/a&gt;. :)&lt;/p&gt;

</description>
      <category>python</category>
      <category>datascience</category>
    </item>
    <item>
      <title>A beginner-friendly guide to get Tailwind CSS up and running in your first web development project</title>
      <dc:creator>Paridhi Agarwal</dc:creator>
      <pubDate>Tue, 09 Feb 2021 18:13:57 +0000</pubDate>
      <link>https://dev.to/paridhi/a-beginner-friendly-guide-to-get-tailwind-css-up-and-running-in-your-first-web-development-project-58l7</link>
      <guid>https://dev.to/paridhi/a-beginner-friendly-guide-to-get-tailwind-css-up-and-running-in-your-first-web-development-project-58l7</guid>
      <description>&lt;p&gt;I've been hearing about &lt;a href="https://twitter.com/dhh/status/1349722147845976065?s=20"&gt;how cool&lt;/a&gt; &lt;a href="https://tailwindcss.com/"&gt;Tailwind CSS&lt;/a&gt; is for quite some time now, so I decided to use it in a project I wanted to work on.&lt;/p&gt;

&lt;p&gt;When I went to the &lt;a href="https://tailwindcss.com/docs/installation"&gt;installation&lt;/a&gt; page on the official Tailwind website, I realized that the guide could be a little overwhelming for people who are just starting out. With this guide, I aim to make the installation process a little simpler.&lt;/p&gt;

&lt;p&gt;Ways to get Tailwind CSS up and running in your project:&lt;/p&gt;

&lt;p&gt;1. &lt;a href="https://play.tailwindcss.com/"&gt;&lt;strong&gt;Tailwind Play&lt;/strong&gt;&lt;/a&gt; - If you want to dive straight into writing code without having to worry about setting up Tailwind on your machine, this is the best option for you. You can tinker with your code and see what it does at the same time. Also, with this option you don't have to dread writing that first line of code (like I do), since it already comes with sample code for you to play with. 🙂&lt;/p&gt;

&lt;p&gt;2. &lt;a href="https://tailwindcss.com/docs/installation#using-tailwind-via-cdn"&gt;&lt;strong&gt;Using Tailwind via CDN&lt;/strong&gt;&lt;/a&gt; - This is a great option for setting up Tailwind locally in no time. You just have to add a single line of code inside the &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt; section of your &lt;em&gt;.html&lt;/em&gt; file and you can start working on your project straight away.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;&amp;lt;link href="https://unpkg.com/tailwindcss@^2/dist/tailwind.min.css" rel="stylesheet"&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This method does have a few &lt;a href="https://tailwindcss.com/docs/installation#using-tailwind-via-cdn"&gt;limitations&lt;/a&gt;, though, so the Tailwind website doesn't recommend it. However, it is a good option if you hate long installation processes that make you want to quit coding.&lt;/p&gt;

&lt;p&gt;3. &lt;a href="https://tailwindcss.com/docs/installation#installing-tailwind-css-as-a-post-css-plugin"&gt;&lt;strong&gt;Installing Tailwind CSS as a PostCSS plugin&lt;/strong&gt;&lt;/a&gt; - This is the method to go for if you want to work on a real-world project and plan to upload the code to your GitHub repo. Since this method installs Tailwind CSS using npm, the first step is to &lt;a href="https://nodejs.org/en/download/"&gt;install &lt;em&gt;Node.js&lt;/em&gt;&lt;/a&gt; on your machine. Next, follow the steps &lt;a href="https://tailwindcss.com/docs/installation#installing-tailwind-css-as-a-post-css-plugin"&gt;here&lt;/a&gt; to finish setting up Tailwind locally. Don't worry if you don't understand a lot of the terms used in the guide. Just follow the steps and you'll be good to go!&lt;/p&gt;

&lt;p&gt;That's it!&lt;/p&gt;

&lt;p&gt;Do let me know if this guide helped you or if you have some feedback.&lt;/p&gt;

&lt;p&gt;You can follow my journey on &lt;a href="https://twitter.com/paridhitweets"&gt;Twitter&lt;/a&gt; as I try to learn web development by building a project using Tailwind CSS.&lt;/p&gt;

</description>
      <category>css</category>
      <category>codenewbie</category>
      <category>html</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
