<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Isac John Eralil</title>
    <description>The latest articles on DEV Community by Isac John Eralil (@isacje).</description>
    <link>https://dev.to/isacje</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3369031%2F8987f0e4-06a1-43ac-b451-a10fdc642f91.jpg</url>
      <title>DEV Community: Isac John Eralil</title>
      <link>https://dev.to/isacje</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/isacje"/>
    <language>en</language>
    <item>
      <title>Exploring Netflix Data with Python: A Developer’s Deep Dive</title>
      <dc:creator>Isac John Eralil</dc:creator>
      <pubDate>Sat, 19 Jul 2025 06:52:14 +0000</pubDate>
      <link>https://dev.to/isacje/exploring-netflix-data-with-python-a-developers-deep-dive-3f5l</link>
      <guid>https://dev.to/isacje/exploring-netflix-data-with-python-a-developers-deep-dive-3f5l</guid>
      <description>&lt;p&gt;As developers, we love digging into datasets to uncover interesting stories, and Netflix’s ever-growing catalog is no exception. In this post, I’ll share how I analyzed Netflix’s content data using Python and Pandas, along with tips to help you get started.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Dataset
&lt;/h2&gt;

&lt;p&gt;I used the popular Netflix Titles dataset from Kaggle (&lt;a href="https://www.kaggle.com/shivamb/netflix-shows" rel="noopener noreferrer"&gt;link here&lt;/a&gt;). It contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;6,000+ records of movies and TV shows&lt;/li&gt;
&lt;li&gt;Metadata like title, type, director, cast, country, date added, release year, rating, duration, and genres (stored in the &lt;code&gt;listed_in&lt;/code&gt; column)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I Did
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Data Cleaning &amp;amp; Preparation
&lt;/h3&gt;

&lt;p&gt;Using Pandas, I:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filled missing values in &lt;code&gt;director&lt;/code&gt;, &lt;code&gt;cast&lt;/code&gt;, and &lt;code&gt;country&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Converted &lt;code&gt;date_added&lt;/code&gt; to datetime for time-series analysis&lt;/li&gt;
&lt;li&gt;Extracted year and month of addition&lt;/li&gt;
&lt;li&gt;Split genres into lists for better filtering&lt;/li&gt;
&lt;li&gt;Created new columns for movie duration and TV show seasons&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s a snippet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;netflix_titles.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Some date_added values may carry stray whitespace, so strip before parsing
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;date_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;date_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;year_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;date_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;month_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;date_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;

&lt;span class="c1"&gt;# Handle missing data (assign the result back instead of using inplace=True on a column)
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;director&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;director&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cast&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cast&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Various&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
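&lt;p&gt;The snippet above stops before the last two cleaning steps: splitting genres and deriving duration columns. Here is a minimal sketch of how they could look, run on a tiny stand-in DataFrame. It assumes genres are stored as a comma-separated string in &lt;code&gt;listed_in&lt;/code&gt; and that &lt;code&gt;duration&lt;/code&gt; looks like "93 min" or "2 Seasons". The new column names (&lt;code&gt;genres&lt;/code&gt;, &lt;code&gt;duration_min&lt;/code&gt;, &lt;code&gt;seasons&lt;/code&gt;) are placeholders, not necessarily the ones used in the repo.&lt;/p&gt;

```python
import pandas as pd

# Tiny stand-in for netflix_titles.csv; the real file has many more columns.
df = pd.DataFrame({
    "type": ["Movie", "TV Show", "Movie"],
    "listed_in": ["Dramas, International Movies", "Crime TV Shows", "Comedies"],
    "duration": ["93 min", "2 Seasons", "110 min"],
})

# Split the comma-separated genre string into a list per title.
df["genres"] = df["listed_in"].str.split(", ")

# Pull the leading number out of duration, e.g. "93 min" or "2 Seasons".
amount = df["duration"].str.extract(r"(\d+)", expand=False).astype(float)

# Keep the number only where it applies; the other rows become NaN.
df["duration_min"] = amount.where(df["type"] == "Movie")
df["seasons"] = amount.where(df["type"] == "TV Show")

print(df[["type", "genres", "duration_min", "seasons"]])
```

&lt;p&gt;Keeping minutes and seasons in separate numeric columns means each can be summarized on its own without mixing units.&lt;/p&gt;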






&lt;h3&gt;
  
  
  Exploratory Data Analysis (EDA)
&lt;/h3&gt;

&lt;p&gt;I focused on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Content type trends (movies vs TV shows) over the years&lt;/li&gt;
&lt;li&gt;Top-producing countries&lt;/li&gt;
&lt;li&gt;Most popular genres&lt;/li&gt;
&lt;li&gt;Movie durations and TV show season counts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example plot using Seaborn:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;seaborn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sns&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;sns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;countplot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;year_added&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hue&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;palette&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;muted&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Movies vs TV Shows Over Years&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
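&lt;p&gt;For the country and genre questions, one possible approach is to split the comma-separated text, give each entry its own row, and count. The sketch below uses a few made-up rows and assumes both &lt;code&gt;country&lt;/code&gt; and &lt;code&gt;listed_in&lt;/code&gt; hold comma-separated values. The helper &lt;code&gt;count_values&lt;/code&gt; is a hypothetical name introduced here for illustration.&lt;/p&gt;

```python
import pandas as pd

# Made-up rows standing in for the real dataset.
df = pd.DataFrame({
    "country": ["United States, India", "India", None, "United Kingdom"],
    "listed_in": ["Dramas, Comedies", "Dramas", "Documentaries", "Comedies, Dramas"],
})


def count_values(column):
    """Count the comma-separated entries in a text column."""
    return (
        column.dropna()      # skip titles with no value
        .str.split(",")      # one list of entries per title
        .explode()           # one row per entry
        .str.strip()         # drop the space left after each comma
        .value_counts()      # most frequent entries first
    )


top_countries = count_values(df["country"])
top_genres = count_values(df["listed_in"])

print(top_countries.head(10))
print(top_genres.head(10))
```

&lt;p&gt;If missing countries were filled with a placeholder such as 'Unknown' during cleaning, that placeholder will show up in the counts and may need to be filtered out.&lt;/p&gt;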






&lt;h2&gt;
  
  
  What I Found
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A clear shift towards &lt;strong&gt;more TV shows added since 2016&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The US dominates content production, but countries like India and the UK are growing fast&lt;/li&gt;
&lt;li&gt;Drama and Comedy lead genre counts, but Documentaries are on the rise&lt;/li&gt;
&lt;li&gt;Most movies run under 100 minutes, while most TV shows have 1–3 seasons&lt;/li&gt;
&lt;/ul&gt;
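&lt;p&gt;One way to check a trend like the TV show shift is to tabulate titles by year and type, then look at the TV share per year. This is a minimal sketch on made-up rows, assuming a &lt;code&gt;year_added&lt;/code&gt; column like the one created during cleaning. The names &lt;code&gt;counts&lt;/code&gt; and &lt;code&gt;tv_share&lt;/code&gt; are illustrative.&lt;/p&gt;

```python
import pandas as pd

# Made-up rows; on the real data, use the cleaned DataFrame instead.
df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "TV Show"],
    "year_added": [2015, 2015, 2015, 2016, 2016, 2016],
})

# Rows are years, columns are content types, values are title counts.
counts = pd.crosstab(df["year_added"], df["type"])

# Fraction of each year's additions that are TV shows.
tv_share = counts["TV Show"] / counts.sum(axis=1)

print(counts)
print(tv_share)
```

&lt;p&gt;A share that climbs from one year to the next would support the shift; raw counts alone can mislead when the total number of additions also changes.&lt;/p&gt;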




&lt;h2&gt;
  
  
  What’s Next?
&lt;/h2&gt;

&lt;p&gt;This dataset is perfect for experimenting with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Recommendation engines&lt;/li&gt;
&lt;li&gt;Sentiment analysis on descriptions&lt;/li&gt;
&lt;li&gt;Time-series forecasting of content growth&lt;/li&gt;
&lt;li&gt;Interactive dashboards with Plotly or Streamlit&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Check It Out!
&lt;/h2&gt;

&lt;p&gt;All code and notebooks are on GitHub:&lt;br&gt;
👉 &lt;a href="https://github.com/isacje/Netflix-Data-Analysis" rel="noopener noreferrer"&gt;https://github.com/isacje/Netflix-Data-Analysis&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feel free to fork, run, and extend! And if you want help generating custom plots or automating your analysis pipeline, just ask.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Happy coding!&lt;/strong&gt;&lt;br&gt;
Isac&lt;/p&gt;





</description>
    </item>
  </channel>
</rss>
