<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jakub Szafran</title>
    <description>The latest articles on DEV Community by Jakub Szafran (@jszafran).</description>
    <link>https://dev.to/jszafran</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F565175%2Fd0b4b73b-6258-4acc-9530-9019a0408734.jpeg</url>
      <title>DEV Community: Jakub Szafran</title>
      <link>https://dev.to/jszafran</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jszafran"/>
    <language>en</language>
    <item>
      <title>Ready for a little Python challenge?</title>
      <dc:creator>Jakub Szafran</dc:creator>
      <pubDate>Thu, 16 Nov 2023 09:47:07 +0000</pubDate>
      <link>https://dev.to/jszafran/ready-for-a-little-python-challenge-20fb</link>
      <guid>https://dev.to/jszafran/ready-for-a-little-python-challenge-20fb</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;The exercise below comes from &lt;strong&gt;Quest Of Python&lt;/strong&gt;, a little side project of mine where I share Python challenges and exercises with example solutions. If you enjoy it and would like to practice more, check out the &lt;a href="https://questofpython.dev"&gt;Quest Of Python website&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;You've come up with a great side-project idea: let's analyze the IT job market in Poland!&lt;/p&gt;

&lt;p&gt;While browsing the &lt;a href="https://justjoin.it"&gt;JustJoin.it&lt;/a&gt; job board, you noticed that all job offers are served as JSON through an HTTP API. Since it's publicly available, you decided to create a small application to fetch and store this data.&lt;/p&gt;

&lt;p&gt;You decide to host your app on AWS.&lt;/p&gt;

&lt;p&gt;Your workload is a short Lambda function written in Python that fetches the data from the job offers API endpoint and persists the JSON into an S3 bucket. It runs on a daily schedule (through an Amazon EventBridge trigger). Each successful run of the function creates a new object in S3, following the &lt;code&gt;s3://some-s3-bucket-name/justjoinit-data/&amp;lt;year&amp;gt;/&amp;lt;month&amp;gt;/&amp;lt;day&amp;gt;.json&lt;/code&gt; naming convention.&lt;/p&gt;

&lt;p&gt;You quickly test it and everything seems to be fine. Then you deploy the resources to your AWS account and forget about the whole thing for a long time.&lt;/p&gt;

&lt;p&gt;Recently you decided to revive the project and try to extract something meaningful from the data. You quickly realize there are gaps in it (some days are missing). It turns out you were so confident in your code that you did not include any retry logic in case an HTTP request failed. Shame on you!&lt;/p&gt;

&lt;h2&gt;
  
  
  Your task
&lt;/h2&gt;

&lt;p&gt;Clone the &lt;a href="https://github.com/quest-of-python/challenges-blueprints"&gt;challenges blueprints repository&lt;/a&gt; and navigate to the &lt;code&gt;0005_justjoinit_data_finding_the_gaps&lt;/code&gt; directory.&lt;br&gt;
It contains a directory called &lt;code&gt;justjoinit_data&lt;/code&gt;, which mimics the structure of the original S3 bucket with raw data: each year is a separate directory containing one subdirectory per month, and each month directory contains JSON files, one per day of data.&lt;/p&gt;

&lt;p&gt;Here's the output of the &lt;code&gt;tree&lt;/code&gt; command run on this directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;justjoinit_data
├── 2021
│   ├── 10
│   │   ├── 23.json
│   │   ├── 24.json
│   │   ├── 25.json
│   │   ├── 26.json
│   │   ├── 27.json
│   │   ├── 28.json
│   │   ├── 29.json
│   │   ├── 30.json
│   │   └── 31.json
│   ├── 11
│   │   ├── 01.json
│   │   ├── 02.json
│   │   ├── 03.json

...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your task is to find out which dates (JSON files) are missing from the &lt;code&gt;justjoinit_data&lt;/code&gt; directory (these are the days when our small AWS job failed for some reason).&lt;/p&gt;

&lt;p&gt;Put your logic into the &lt;code&gt;find_missing_dates&lt;/code&gt; function (inside the &lt;code&gt;missing_dates.py&lt;/code&gt; file). Missing dates should be returned as a single string, with the dates joined by a comma and a space. If &lt;code&gt;2021-01-01&lt;/code&gt;, &lt;code&gt;2021-03-05&lt;/code&gt; and &lt;code&gt;2022-05-10&lt;/code&gt; were the missing dates, the result string would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"2021-01-01, 2021-03-05, 2022-05-10"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can assume that month directories and day files are always named after a valid month (&lt;code&gt;1 &amp;lt;= month &amp;lt;= 12&lt;/code&gt;) or day (&lt;code&gt;1 &amp;lt;= day &amp;lt;= 31&lt;/code&gt;), and that the days within each month are valid (for example, there are no dates like February 31st).&lt;/p&gt;

&lt;p&gt;You can use the test from the &lt;code&gt;test_missing_dates.py&lt;/code&gt; file to check whether your solution is correct. Run the command below (from within the &lt;code&gt;0005_justjoinit_data_finding_the_gaps&lt;/code&gt; directory) to run the test suite:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; unittest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python test_missing_dates.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;P.S. I plan to share this JustJoin.it job offers dataset publicly (probably on Kaggle). Once this is done, I'll update this page and provide the link to the dataset.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Example solution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: you'll find a detailed explanation of the solution below the code snippet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pathlib&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_missing_dates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_directory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;dates_from_disk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;input_directory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;glob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"**/*.json"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;

        &lt;span class="n"&gt;dates_from_disk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;start_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dates_from_disk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;end_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dates_from_disk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;difference_in_days&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;end_date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;
    &lt;span class="n"&gt;expected_dates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_date&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;difference_in_days&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;

    &lt;span class="n"&gt;missing_dates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;expected_dates&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;dates_from_disk&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%Y-%m-%d"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;missing_dates&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our solution for this challenge leverages sets and the operations they provide (set difference). The steps we'll take:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;create a set of all dates that exist in the &lt;code&gt;justjoinit_data&lt;/code&gt; directory (the &lt;code&gt;dates_from_disk&lt;/code&gt; set)&lt;/li&gt;
&lt;li&gt;find the earliest and latest dates in the &lt;code&gt;dates_from_disk&lt;/code&gt; set&lt;/li&gt;
&lt;li&gt;create a set of expected dates, &lt;code&gt;expected_dates&lt;/code&gt;, containing every date in the range between the earliest and latest dates found in the previous step (both inclusive)&lt;/li&gt;
&lt;li&gt;calculate the difference between &lt;code&gt;expected_dates&lt;/code&gt; and &lt;code&gt;dates_from_disk&lt;/code&gt; (dates present in &lt;code&gt;expected_dates&lt;/code&gt; but missing from &lt;code&gt;dates_from_disk&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;sort the missing dates chronologically and convert them to the expected string format&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We start by defining an empty set, &lt;code&gt;dates_from_disk&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The glob pattern &lt;code&gt;**/*.json&lt;/code&gt; lets us iterate over all files with the &lt;code&gt;.json&lt;/code&gt; extension (&lt;code&gt;**&lt;/code&gt; means traversing the &lt;code&gt;justjoinit_data&lt;/code&gt; directory and all its subdirectories &lt;strong&gt;recursively&lt;/strong&gt;).&lt;/p&gt;
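
&lt;p&gt;As a minimal sketch of how the recursive pattern behaves, the snippet below builds a small throwaway directory tree and lists what &lt;code&gt;glob&lt;/code&gt; finds. The file names, including &lt;code&gt;notes.txt&lt;/code&gt;, are made up for illustration and are not part of the challenge data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import pathlib
import tempfile

# Build a tiny, hypothetical tree that mimics the justjoinit_data layout.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp) / "justjoinit_data"
    for relative in ("2021/10/30.json", "2021/10/31.json", "2021/11/01.json", "2021/notes.txt"):
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    # "**" descends into every subdirectory; "*.json" filters by extension.
    found = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.json"))

print(found)
# ['2021/10/30.json', '2021/10/31.json', '2021/11/01.json']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The text file is skipped because it does not match &lt;code&gt;*.json&lt;/code&gt;. The results are sorted here only to make the printed output stable.&lt;/p&gt;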

&lt;p&gt;To extract the year, month and day, we use the path's &lt;code&gt;.parts&lt;/code&gt; attribute, a tuple containing the individual components of the path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/some/path/justjoinit_data/2022/10/01.json"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'some'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'path'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'justjoinit_data'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'2022'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'01.json'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tuple unpacking lets us conveniently capture the year, month and day in a single line. Every part that comes before the year is captured in the &lt;code&gt;_&lt;/code&gt; variable (a Pythonic way of saying that you don't care about a value). The asterisk &lt;code&gt;*&lt;/code&gt; in front of it means that &lt;code&gt;_&lt;/code&gt; collects all the leading elements into a list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;year&lt;/span&gt;
&lt;span class="s"&gt;'2022'&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;month&lt;/span&gt;
&lt;span class="s"&gt;'10'&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;day&lt;/span&gt;
&lt;span class="s"&gt;'01.json'&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'some'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'path'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'justjoinit_data'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a small cleanup (removing the &lt;code&gt;.json&lt;/code&gt; suffix from &lt;code&gt;day&lt;/code&gt; and converting &lt;code&gt;day&lt;/code&gt;, &lt;code&gt;month&lt;/code&gt; and &lt;code&gt;year&lt;/code&gt; to integers), we can construct a valid &lt;code&gt;date&lt;/code&gt; object and add it to the &lt;code&gt;dates_from_disk&lt;/code&gt; set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;        &lt;span class="n"&gt;dates_from_disk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
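
&lt;p&gt;A possible alternative for the cleanup step (not what the solution above uses) is the path's &lt;code&gt;.stem&lt;/code&gt; attribute, which returns the final path component without its suffix. A small sketch with a made-up path; the &lt;code&gt;_filename&lt;/code&gt; and &lt;code&gt;parsed&lt;/code&gt; names are hypothetical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import pathlib
from datetime import date

# Hypothetical path, used only for illustration.
file = pathlib.PurePosixPath("justjoinit_data/2022/10/01.json")

*_, year, month, _filename = file.parts

# .stem is the file name without its last suffix: "01.json" becomes "01".
parsed = date(int(year), int(month), int(file.stem))

print(parsed)
# 2022-10-01
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Both approaches give the same result for this directory layout. &lt;code&gt;int("01")&lt;/code&gt; handles the leading zero, so no extra stripping is needed.&lt;/p&gt;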



&lt;p&gt;After the &lt;code&gt;for&lt;/code&gt; loop finishes, &lt;code&gt;dates_from_disk&lt;/code&gt; contains all the dates that exist in the &lt;code&gt;justjoinit_data&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;We use the built-in &lt;code&gt;min&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt; functions to find the earliest and latest dates. From these we compute a helper variable called &lt;code&gt;difference_in_days&lt;/code&gt;, which is then used to generate the set of expected dates between &lt;code&gt;start_date&lt;/code&gt; and &lt;code&gt;end_date&lt;/code&gt; (both inclusive):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;expected_dates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;start_date&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;days&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;difference_in_days&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
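
&lt;p&gt;To see why the &lt;code&gt;+ 1&lt;/code&gt; is needed, here is a small worked example with made-up boundary dates. &lt;code&gt;range&lt;/code&gt; excludes its upper bound, so without the &lt;code&gt;+ 1&lt;/code&gt; the end date itself would be left out.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import date, timedelta

# Hypothetical boundaries, chosen to cross a month end.
start_date = date(2021, 10, 30)
end_date = date(2021, 11, 2)

difference_in_days = (end_date - start_date).days  # 3

# range(3 + 1) yields 0, 1, 2, 3, so both boundaries are included.
expected_dates = {start_date + timedelta(days=i) for i in range(difference_in_days + 1)}

print(sorted(d.isoformat() for d in expected_dates))
# ['2021-10-30', '2021-10-31', '2021-11-01', '2021-11-02']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;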



&lt;p&gt;To find the missing dates within the &lt;code&gt;justjoinit_data&lt;/code&gt; directory, we simply calculate the difference between &lt;code&gt;expected_dates&lt;/code&gt; and &lt;code&gt;dates_from_disk&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;missing_dates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;expected_dates&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;dates_from_disk&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
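
&lt;p&gt;The &lt;code&gt;-&lt;/code&gt; operator on sets returns the elements of the left operand that are absent from the right one. A tiny illustration, with made-up dates standing in for the real data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import date

# Hypothetical data: four expected days, two of which exist on disk.
expected_dates = {date(2021, 10, 30), date(2021, 10, 31), date(2021, 11, 1), date(2021, 11, 2)}
dates_from_disk = {date(2021, 10, 30), date(2021, 11, 2)}

# Equivalent to expected_dates.difference(dates_from_disk).
missing_dates = expected_dates - dates_from_disk

print(sorted(missing_dates))
# [datetime.date(2021, 10, 31), datetime.date(2021, 11, 1)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;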



&lt;p&gt;The last thing we do is sort the dates (&lt;code&gt;sorted(missing_dates)&lt;/code&gt;), convert them to strings with the &lt;code&gt;.strftime("%Y-%m-%d")&lt;/code&gt; method of &lt;code&gt;date&lt;/code&gt; objects, and join them with the &lt;code&gt;", "&lt;/code&gt; string (so the result matches the expected format from the task description).&lt;/p&gt;
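
&lt;p&gt;Sets are unordered, so sorting before joining is what makes the output deterministic. A short sketch, reusing the three example dates from the task description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from datetime import date

# Example missing dates; a set has no guaranteed iteration order.
missing_dates = {date(2022, 5, 10), date(2021, 1, 1), date(2021, 3, 5)}

# date objects compare chronologically, so sorted() orders them by time.
result = ", ".join(d.strftime("%Y-%m-%d") for d in sorted(missing_dates))

print(result)
# 2021-01-01, 2021-03-05, 2022-05-10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;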

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;I hope you enjoyed this little exercise. I encourage you to check out &lt;a href="https://questofpython.dev"&gt;Quest Of Python&lt;/a&gt; for more :-)!&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
