<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cody Meadows</title>
    <description>The latest articles on DEV Community by Cody Meadows (@cmeadowstech).</description>
    <link>https://dev.to/cmeadowstech</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F832331%2Fcca91826-43a3-427b-a05c-64d8853de9dc.jpg</url>
      <title>DEV Community: Cody Meadows</title>
      <link>https://dev.to/cmeadowstech</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cmeadowstech"/>
    <language>en</language>
    <item>
      <title>Running Matomo &amp; GlitchTip with Dokku</title>
      <dc:creator>Cody Meadows</dc:creator>
      <pubDate>Tue, 07 Nov 2023 20:22:32 +0000</pubDate>
      <link>https://dev.to/cmeadowstech/running-matomo-glitchtip-with-dokku-3pn3</link>
      <guid>https://dev.to/cmeadowstech/running-matomo-glitchtip-with-dokku-3pn3</guid>
      <description>&lt;p&gt;I've been meaning to set up analytics and error reporting for my sites for a while now, and finally got around to it today. I'd heard Dokku was a good way to deploy resources like this, and while it worked pretty well in the end, some of the examples for these apps were quite dated. &lt;/p&gt;

&lt;p&gt;Thought I'd share my new instructions on these.&lt;/p&gt;

&lt;h2&gt;
  
  
  Matomo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/rclement/dokku-matomo"&gt;https://github.com/rclement/dokku-matomo&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I originally found this guide, but some of the instructions are dated and gave me trouble. Below should be a more accurate guide on running Matomo with the current version of Dokku.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deploy your own instance of Matomo on Dokku.
&lt;/h3&gt;

&lt;p&gt;This setup makes use of the great ready-to-use &lt;a href="https://github.com/crazy-max/docker-matomo"&gt;docker-matomo&lt;/a&gt; image by &lt;a href="https://github.com/crazy-max"&gt;crazy-max&lt;/a&gt;, which does all of the heavy lifting to properly deploy Matomo without too much headache.&lt;/p&gt;

&lt;p&gt;Requirements&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/dokku/dokku"&gt;Dokku&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dokku/dokku-mariadb"&gt;dokku-mariadb&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dokku/dokku-letsencrypt"&gt;dokku-letsencrypt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;App and database&lt;/p&gt;

&lt;p&gt;First create a new Dokku app. We will call it matomo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku apps:create matomo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next create the MariaDB database required by Matomo and link it to the app.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku mariadb:create mariadb-matomo
dokku mariadb:link mariadb-matomo matomo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configuration
&lt;/h3&gt;

&lt;p&gt;Main configuration&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;TZ&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;America/Chicago
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;MEMORY_LIMIT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;256M
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;UPLOAD_MAX_SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;16M
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;OPCACHE_MEM_SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;128
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;REAL_IP_FROM&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.0.0.0/32
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;REAL_IP_HEADER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;X-Forwarded-For
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; matomo &lt;span class="nv"&gt;LOG_LEVEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;WARN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Persistent storage
&lt;/h3&gt;

&lt;p&gt;You need to mount a volume on your host (the machine running Dokku) to persist all settings that you set in the Matomo interface.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; /var/lib/dokku/data/storage/matomo
&lt;span class="c"&gt;# UID:GUID are set to 101. These are the values the nginx image uses,&lt;/span&gt;
&lt;span class="c"&gt;# that is used by crazymax/matomo&lt;/span&gt;
&lt;span class="nb"&gt;chown &lt;/span&gt;101:101 /var/lib/dokku/data/storage/matomo
dokku storage:mount matomo /var/lib/dokku/data/storage/matomo:/data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Domain setup
&lt;/h3&gt;

&lt;p&gt;To get the routing working, we need to apply a few settings. First we set the domain.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku domains:add matomo matomo.example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also need to update the ports set by Dokku.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku ports:set matomo http:80:8000
dokku ports:remove matomo http:80:5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If dokku proxy:report shows more than one port mapping, remove every mapping except the one added above.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy app for the first time
&lt;/h3&gt;

&lt;p&gt;Deploy Matomo from the latest version of the Docker image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku git:from-image matomo crazymax/matomo:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set up Let's Encrypt&lt;/p&gt;

&lt;p&gt;Set up an SSL certificate via Let's Encrypt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku letsencrypt:set matomo email user@domain.com
dokku letsencrypt:enable matomo
dokku letsencrypt:auto-renew matomo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Get the MariaDB details for the setup
&lt;/h3&gt;

&lt;p&gt;We will need to set up Matomo in the web interface and provide the database details. You should be able to access the page via &lt;a href="https://matomo.example.com"&gt;https://matomo.example.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Run the command below to retrieve the DSN.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku mariadb:info mariadb-matomo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An example DSN might look like this: mysql://mariadb:ffd4fc238ba8adb3@dokku-mariadb-mariadb-matomo:3306/mariadb_matomo. Copy and paste the details as follows:&lt;/p&gt;

&lt;p&gt;Hostname: dokku-mariadb-mariadb-matomo&lt;br&gt;
Username: mariadb&lt;br&gt;
Password: ffd4fc238ba8adb3&lt;br&gt;
Database Name: mariadb_matomo&lt;/p&gt;
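&lt;p&gt;If you would rather not pick the DSN apart by eye, the same split can be sketched with Python's standard library. This is only an illustration using the example DSN above; the split_dsn helper is a made-up name, not part of Dokku or Matomo.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def split_dsn(dsn):
    """Break a MariaDB DSN into the four fields the Matomo installer asks for."""
    parts = urlsplit(dsn)
    return {
        "hostname": parts.hostname,
        "username": parts.username,
        "password": parts.password,
        "database": parts.path.lstrip("/"),
    }


# Example DSN from the text above; substitute the one printed by mariadb:info.
fields = split_dsn(
    "mysql://mariadb:ffd4fc238ba8adb3@dokku-mariadb-mariadb-matomo:3306/mariadb_matomo"
)
for name, value in fields.items():
    print(f"{name}: {value}")
```

&lt;p&gt;The port (3306 here) is also available as urlsplit(dsn).port if the installer asks for it.&lt;/p&gt;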

&lt;p&gt;After going through the setup, you should be able to use Matomo.&lt;/p&gt;
&lt;h2&gt;
  
  
  GlitchTip
&lt;/h2&gt;

&lt;p&gt;I was originally planning to use Sentry, but they've really made it a pain to self-host. They pretty much require you to use docker compose, which Dokku doesn't appear to have solid support for. Or at least, documentation.&lt;/p&gt;

&lt;p&gt;I would probably need a dedicated machine or proper Docker environment to run it alongside other applications such as Matomo.&lt;br&gt;
So I looked for lighter weight error reporting applications and came across &lt;a href="https://glitchtip.com/"&gt;GlitchTip&lt;/a&gt;, which seems to fit the bill.&lt;/p&gt;

&lt;p&gt;Didn't see any Dokku instructions for this one, but the following worked for me:&lt;/p&gt;
&lt;h3&gt;
  
  
  Install Dokku plugins
&lt;/h3&gt;

&lt;p&gt;Go to your Dokku server and install the following plugins:&lt;/p&gt;

&lt;p&gt;Install the official PostgreSQL plugin&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install the official Redis plugin&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;dokku plugin:install https://github.com/dokku/dokku-redis.git redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Prepare dokku
&lt;/h3&gt;

&lt;p&gt;Create dokku app&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku apps:create glitchtip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create postgresql db and link it to the app&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku postgres:create glitchtip
dokku postgres:link glitchtip glitchtip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create redis instance and link it to the app&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku redis:create glitchtip
dokku redis:link glitchtip glitchtip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Set &lt;a href="https://glitchtip.com/documentation/install#configuration"&gt;required variables&lt;/a&gt; for GlitchTip
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; glitchtip &lt;span class="nv"&gt;SECRET_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="sb"&gt;`&lt;/span&gt;openssl rand &lt;span class="nt"&gt;-base64&lt;/span&gt; 64&lt;span class="sb"&gt;`&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; glitchtip &lt;span class="nv"&gt;GLITCHTIP_DOMAIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://glitchtip.example.com
dokku config:set &lt;span class="nt"&gt;--no-restart&lt;/span&gt; glitchtip &lt;span class="nv"&gt;GLITCHTIP_DOMAIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"smtp://email:password@smtp_url:port"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
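&lt;p&gt;If openssl is not available where you generate the key, a roughly equivalent value can be produced with Python's standard library. This is a sketch under that assumption; make_secret_key is a hypothetical helper name, and the result still has to be passed to dokku config:set yourself.&lt;/p&gt;

```python
import base64
import secrets


def make_secret_key(num_bytes=64):
    """Return num_bytes of OS randomness, base64-encoded on a single line."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


secret_key = make_secret_key()
print(secret_key)
```

&lt;p&gt;Like the shell version, this yields a single line with no spaces or line breaks.&lt;/p&gt;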



&lt;h3&gt;
  
  
  Domain setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku domains:add glitchtip glitchtip.example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We also need to update the ports set by Dokku.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku ports:set glitchtip http:80:8080
dokku ports:remove glitchtip http:80:5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If dokku proxy:report shows more than one port mapping, remove every mapping except the one added above.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploy
&lt;/h3&gt;

&lt;p&gt;Deploy GlitchTip from its Docker image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku git:from-image glitchtip glitchtip/glitchtip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Set up Let's Encrypt
&lt;/h3&gt;

&lt;p&gt;Set up an SSL certificate via Let's Encrypt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dokku letsencrypt:set glitchtip email user@domain.com
dokku letsencrypt:enable glitchtip
dokku letsencrypt:auto-renew glitchtip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should then be able to access glitchtip.example.com and create the first user account.&lt;/p&gt;

</description>
      <category>dokku</category>
      <category>matomo</category>
      <category>glitchtip</category>
    </item>
    <item>
      <title>Scraping, Visualizing r/tea</title>
      <dc:creator>Cody Meadows</dc:creator>
      <pubDate>Sun, 19 Feb 2023 18:48:58 +0000</pubDate>
      <link>https://dev.to/cmeadowstech/scraping-visualizing-rtea-4ho0</link>
      <guid>https://dev.to/cmeadowstech/scraping-visualizing-rtea-4ho0</guid>
      <description>&lt;h1&gt;
  
  
  Reddit Tea Scraper
&lt;/h1&gt;

&lt;p&gt;Occasionally I like to find new online tea vendors to buy from, and when I do I visit r/tea to see what is recommended. However, I noticed that new suggestions were scattered all over the place, so I figured I would spend way too much time scraping all of those vendors to see who was most recommended.&lt;/p&gt;

&lt;p&gt;Unfortunately, scraping an entire year's worth of data ended up being more difficult than I originally thought, in part due to Reddit's API utilization limits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scraping Data
&lt;/h2&gt;

&lt;p&gt;I started out making requests manually to the API endpoint, but because of the way it returns comments, I ended up switching to the &lt;a href="https://praw.readthedocs.io/en/stable/" rel="noopener noreferrer"&gt;PRAW&lt;/a&gt; (Python Reddit API Wrapper) package to handle much of the logic. No need to reinvent the wheel when PRAW already handles tasks such as fetching all comments and metadata in a much easier-to-parse manner.&lt;/p&gt;

&lt;p&gt;PRAW ended up having its own limitations, though. It respects Reddit's official API limit of 1,000 submissions, and there did not seem to be a good way to filter by date to batch my queries. As a result, I turned to &lt;a href="https://github.com/pushshift/api" rel="noopener noreferrer"&gt;Pushshift&lt;/a&gt; - a third-party Reddit API that allows you to query its own database of Reddit submissions and comments.&lt;/p&gt;

&lt;p&gt;Unfortunately, it too had some issues - due to a data migration dating back to late last year, its database was missing submissions prior to November. Fortunately, they had their data dumps stored in monthly chunks that you could download &lt;a href="https://files.pushshift.io/reddit/submissions/" rel="noopener noreferrer"&gt;from their site.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the end, I used a combination of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pushshift data dumps to get all submissions for 2022&lt;/li&gt;
&lt;li&gt;PRAW to get all of their comments
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;RS_2022-12&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;smb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;smb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;subreddit_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t5_2qq5e&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;smb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;smb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;permalink&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

            &lt;span class="n"&gt;comments&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;submission&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;reddit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;smb&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

                &lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace_more&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;comment&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;comment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;comment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body_html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I had to download the dumps and process them in monthly batches because the uncompressed files came to ~130GB, but this approach let me reliably get all submissions and their comments for 2022.&lt;/p&gt;

&lt;p&gt;I did apply some conditions to filter out low-quality submissions and comments.&lt;/p&gt;
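&lt;p&gt;The submission filter from the snippet above can be isolated into a small function, which makes it easier to adjust the thresholds later. This is a minimal sketch: the keep_submission and filter_dump names and the sample records are made up for illustration, and it assumes each line of a dump is one JSON object, as in the loop above.&lt;/p&gt;

```python
import json

TEA_SUBREDDIT_ID = "t5_2qq5e"  # subreddit_id used in the snippet above


def keep_submission(record, min_score=1):
    """Return True for r/tea submissions scoring above min_score."""
    same_subreddit = record.get("subreddit_id") == TEA_SUBREDDIT_ID
    return same_subreddit and record.get("score", 0) > min_score


def filter_dump(lines):
    """Yield parsed submissions that pass keep_submission; skip blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if keep_submission(record):
            yield record


# Made-up sample records standing in for lines of a monthly dump file.
sample_lines = [
    '{"id": "aaa", "subreddit_id": "t5_2qq5e", "score": 5}',
    '{"id": "bbb", "subreddit_id": "t5_2qq5e", "score": 1}',
    '{"id": "ccc", "subreddit_id": "t5_other", "score": 50}',
]
kept_ids = [record["id"] for record in filter_dump(sample_lines)]
print(kept_ids)
```

&lt;p&gt;Because filter_dump is a generator, it can be fed an open file handle directly and never holds a whole monthly dump in memory.&lt;/p&gt;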

&lt;h2&gt;
  
  
  Storing Data
&lt;/h2&gt;

&lt;p&gt;While I could have applied my logic to this data while scraping it, I wanted to break up the flow and simply store the data I needed for processing somewhere until I was ready to do something with it. This would also help me make changes to how I extract references to vendors.&lt;/p&gt;

&lt;p&gt;I first considered storing it in a CSV or JSON file, but both seemed clunky ways to store data like this. Instead, I chose to write all of this to a Postgres database using the &lt;a href="https://pypi.org/project/psycopg2/" rel="noopener noreferrer"&gt;psycopg2&lt;/a&gt; package.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;pg_connect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psycopg2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xxx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;database&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;xxx&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;xxx&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;POSTGRES_PW&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Database connected successfully&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;cur&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pg_connect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;cur&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
                            INSERT INTO tea (postid, title, url, permalink, selftext, comments, created_utc)
                            VALUES (%s, %s, %s, %s, %s, ARRAY [%s], %s)
                            ON CONFLICT DO NOTHING
                        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;permalink&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;selftext_html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;comments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcfromtimestamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;submission&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_utc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;%Y-%m-%d %H:%M:%S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                        &lt;span class="n"&gt;pg_connect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;psycopg2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DatabaseError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;pg_connect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;close&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
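&lt;p&gt;One detail in the insert above: datetime.utcfromtimestamp is deprecated as of Python 3.12. A timezone-aware conversion that produces the same string format might look like the sketch below; format_created_utc is a hypothetical helper name, not something from the original script.&lt;/p&gt;

```python
from datetime import datetime, timezone


def format_created_utc(epoch_seconds):
    """Format a Unix timestamp as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(format_created_utc(1672531200))
```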



&lt;h2&gt;
  
  
  Processing Data
&lt;/h2&gt;

&lt;p&gt;One benefit of using a database like Postgres was that I could query from it instead of having to write a separate Python script to do it for me. &lt;/p&gt;

&lt;p&gt;This may not be the 'best' query, but it returned what I needed. I used UNNEST to separate the items in the comments array and REGEXP_MATCHES with a very loose pattern to match the domain.com syntax.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;POSTID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;PERMALINK&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;REGEXP_MATCHES&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;UNNEST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;COMMENTS&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\w&lt;/span&gt;&lt;span class="s1"&gt;{4,}(?&amp;lt;!wordpress|blogspot|amazon|reddit|imgur|youtu.*|seriouseats|aliexpress|etsy|facebook|github|wikipedia|redd)&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s1"&gt;(?!png|jpg|html|wordpress|pdf|htm|php|blogspot)[a-z]+'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'g'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;WEBSITE_DOMAIN&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;TEA&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;permalink&lt;/span&gt; &lt;span class="o"&gt;!~&lt;/span&gt; &lt;span class="s1"&gt;'marketing_monday'&lt;/span&gt;
&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;permalink&lt;/span&gt; &lt;span class="o"&gt;!~&lt;/span&gt; &lt;span class="s1"&gt;'daily_discussion'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;WEBSITE_DOMAIN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Normally I would have liked to count the results via the query as well, but the way Postgres projects the results AS WEBSITE_DOMAIN made it difficult for me to do so with my limited Postgres knowledge. Instead, I exported the results to a CSV and used pivot tables to calculate the results for me.&lt;/p&gt;
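&lt;p&gt;For anyone who prefers a script to pivot tables, the counting step can be sketched in a few lines of Python. This is not what I actually ran: the column names, the braces around each value, and the domains in the sample are all assumptions made for illustration.&lt;/p&gt;

```python
import csv
import io
from collections import Counter

# Made-up stand-in for the exported CSV. The column names and domains are
# assumptions; the braces mimic how a Postgres text array is usually printed.
exported_csv = io.StringIO(
    "postid,permalink,website_domain\n"
    "abc123,/r/tea/comments/abc123/,{alphatea.com}\n"
    "abc123,/r/tea/comments/abc123/,{betatea.com}\n"
    "def456,/r/tea/comments/def456/,{alphatea.com}\n"
)


def count_domains(csv_file, column="website_domain"):
    """Count how often each domain appears in the given CSV column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        domain = row[column].strip("{}").lower()
        if domain:
            counts[domain] += 1
    return counts


counts = count_domains(exported_csv)
for domain, total in counts.most_common(20):
    print(domain, total)
```

&lt;p&gt;Swapping the io.StringIO object for an open file handle would run the same count over a real export.&lt;/p&gt;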

&lt;h2&gt;
  
  
  Visualizing Data
&lt;/h2&gt;

&lt;p&gt;Now that I had what I wanted - what do I do with it? I thought it would be neat to visualize the top 20 results.&lt;/p&gt;

&lt;p&gt;To this end, I used the &lt;a href="https://github.com/vizzuhq/ipyvizzu" rel="noopener noreferrer"&gt;ipyvizzu&lt;/a&gt; package in combination with a jupyter notebook to create an animated graph with my results.&lt;/p&gt;

&lt;p&gt;I could probably write up an entire separate article on this package, as I found the documentation for both it and the underlying Vizzu library to be lacking in some aspects. In short, it lets you take your data and animate it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chart&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;animate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;channels&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vendor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;labels&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;range&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;100%&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;# of References on r/tea&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;interlacing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;label&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;attach&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Vendor&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;color&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vendor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reverse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Most Popular Online Vendors 2022&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nc"&gt;Style&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plot&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backgroundColor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#1b2e24&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;easing&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ease&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Was more work than I expected to animate this to my liking, but I'm pretty happy with the end result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F30980rh1kd688hlfos5y.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F30980rh1kd688hlfos5y.gif" alt="Animated graph" width="587" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Find the full GitHub Repo here: &lt;a href="https://github.com/cmeadowstech/Reddit-Tea-Scraper" rel="noopener noreferrer"&gt;https://github.com/cmeadowstech/Reddit-Tea-Scraper&lt;/a&gt;&lt;/p&gt;

</description>
      <category>writing</category>
      <category>productivity</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Website... Unchained</title>
      <dc:creator>Cody Meadows</dc:creator>
      <pubDate>Sat, 07 Jan 2023 16:18:55 +0000</pubDate>
      <link>https://dev.to/cmeadowstech/website-unchained-3b0e</link>
      <guid>https://dev.to/cmeadowstech/website-unchained-3b0e</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
Overview

&lt;ul&gt;
&lt;li&gt;Models and Views&lt;/li&gt;
&lt;li&gt;settings.py&lt;/li&gt;
&lt;li&gt;Cosmos MongoDB&lt;/li&gt;
&lt;li&gt;Azure Storage backend&lt;/li&gt;
&lt;li&gt;Security&lt;/li&gt;
&lt;li&gt;Terraform&lt;/li&gt;
&lt;li&gt;Security&lt;/li&gt;
&lt;li&gt;Pipelines&lt;/li&gt;
&lt;li&gt;Dockerfile&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The original version of my website was created for the &lt;a href="https://cloudresumechallenge.dev/docs/the-challenge/azure/" rel="noopener noreferrer"&gt;Cloud Resume Challenge&lt;/a&gt;, but after earning the AZ-104 certification I decided to break away from its mold a bit. I had started to focus more on Python and wanted to redo my site with Django. I also wanted to do things properly, with IAC via Terraform and CI/CD via GitHub Actions. &lt;/p&gt;

&lt;p&gt;A little overkill for a single-page site, but I thought it was a fun project and it gives me room to expand in the future.&lt;/p&gt;

&lt;p&gt;I delved into Django a bit last summer, and IAC / CI/CD the last several months, but I want to thank the following sources for helping me get through this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://developer.mozilla.org/en-US/docs/Learn/Server-side/Django" rel="noopener noreferrer"&gt;mdn web docs&lt;/a&gt; - Beginner's Guide to Django&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simpleisbetterthancomplex.com/series/2017/09/04/a-complete-beginners-guide-to-django-part-1.html" rel="noopener noreferrer"&gt;simple is better than complex&lt;/a&gt; - Another beginner's guide to Django&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.terraformupandrunning.com/" rel="noopener noreferrer"&gt;Terraform: Up and Running&lt;/a&gt; - Excellent read on using Terraform&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://gaunacode.com/deploying-terraform-at-scale-with-github-actions" rel="noopener noreferrer"&gt;Deploying Terraform at scale with GitHub Actions&lt;/a&gt; - A very straightforward article by Facundo Gauna on using GitHub Actions to deploy Terraform
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;    ├───app
    │   ├───cmeadows_tech           &lt;span class="c"&gt;# Project folder&lt;/span&gt;
    │   ├───home                    &lt;span class="c"&gt;# Main app folder&lt;/span&gt;
    └───infra
        ├───global                  &lt;span class="c"&gt;# Global Terraform IAC&lt;/span&gt;
        └───web                     &lt;span class="c"&gt;# Terraform IAC for running app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Models and Views
&lt;/h2&gt;

&lt;p&gt;Pretty straightforward for a single-page site. At the moment I have one model for my projects, so in the future I can more easily add, remove and modify them. You know, your basic CRUD operations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Project&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help_text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;The title of the project&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;URLField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;help_text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;The URL for the project&lt;/span&gt;&lt;span class="se"&gt;\'&lt;/span&gt;&lt;span class="s"&gt;s repo or blog post&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blank&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tech&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CharField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help_text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Comma separated list of technologies used by the project&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;synopsis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TextField&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;help_text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Summary of the project&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
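
&lt;p&gt;Since tech is stored as a single comma-separated string, it has to be split before each technology can be rendered on its own. A minimal sketch of how that could look in plain Python; the split_tech name is made up for illustration and is not necessarily what the site uses:&lt;/p&gt;

```python
def split_tech(tech):
    """Turn 'Django, Terraform, Azure' into ['Django', 'Terraform', 'Azure']."""
    # Strip whitespace around each entry and drop empty ones (e.g. a trailing comma)
    return [item.strip() for item in tech.split(",") if item.strip()]


tech_list = split_tech("Django, Terraform , Azure,")
print(tech_list)
```

&lt;p&gt;A helper like this could be called from the view, or exposed as a method on the model, so the template only ever sees a list.&lt;/p&gt;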



&lt;p&gt;In the future I would like to add another model for a more extended project description, akin to a blog post, so visitors can read more about my projects without having to follow through to another site.&lt;/p&gt;

&lt;h2&gt;
  
  
  settings.py
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cosmos MongoDB
&lt;/h3&gt;

&lt;p&gt;To try and save on costs and because I like NoSQL databases, I wanted to put my database up on Cosmos DB. The easiest way to do this was with its MongoDB API and the &lt;a href="https://github.com/doableware/djongo" rel="noopener noreferrer"&gt;Djongo&lt;/a&gt; backend.&lt;/p&gt;

&lt;p&gt;This did come with some difficulties though, as the package dependencies were a mess after updates to Python3 and Django itself.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/doableware/djongo/issues/171" rel="noopener noreferrer"&gt;This open issue&lt;/a&gt; has an ongoing conversation on the problems I faced, and is what helped me correct my dependencies.&lt;/p&gt;

&lt;p&gt;These packages work together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;asgiref&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;3.6&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;Django&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;4.1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;djongo&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;1.3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;
&lt;span class="n"&gt;dnspython&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;2.2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;pymongo&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;3.12&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;pytz&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;2022.7&lt;/span&gt;           &lt;span class="c1"&gt;# This must be installed manually, it is not included in djongo
&lt;/span&gt;&lt;span class="n"&gt;sqlparse&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;tzdata&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="mf"&gt;2022.7&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Note: Cosmos uses a different port than the default MongoDB port, so you have to specify it in the connection string.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
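
&lt;p&gt;To illustrate the note above, here is a rough sketch of building such a connection string with an explicit port. Everything specific in it is an assumption to check against the connection string shown in the Azure portal: the port 10255, the mongo.cosmos.azure.com host suffix, the query options, and the build_cosmos_uri name itself. The account name and key are placeholders.&lt;/p&gt;

```python
from urllib.parse import quote_plus, urlencode

# Assumption: Cosmos DB's MongoDB API commonly listens on 10255, not MongoDB's default 27017
COSMOS_MONGO_PORT = 10255


def build_cosmos_uri(account, key, port=COSMOS_MONGO_PORT):
    """Build a MongoDB connection string that spells out the port."""
    # Assumed host pattern; older accounts may use a different suffix
    host = f"{account}.mongo.cosmos.azure.com"
    # Keys often contain '/', '+' and '=', so they must be percent-encoded
    credentials = f"{quote_plus(account)}:{quote_plus(key)}"
    # Assumed options; copy the real ones from the portal
    options = urlencode({"ssl": "true", "retrywrites": "false"})
    return f"mongodb://{credentials}@{host}:{port}/?{options}"


uri = build_cosmos_uri("example-account", "not/a+real=key")
print(uri)
```

&lt;p&gt;The resulting string is what would go into the DJONGO_HOST setting described in the Security section.&lt;/p&gt;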
&lt;h3&gt;
  
  
  Azure Storage backend
&lt;/h3&gt;

&lt;p&gt;I also wanted to use Azure Storage to serve my static files, which luckily was pretty straightforward to set up with the &lt;a href="https://django-storages.readthedocs.io/en/latest/backends/azure.html" rel="noopener noreferrer"&gt;Azure Storage backend&lt;/a&gt;. In the future I might look into setting up a CDN to serve these files as well.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;AZURE_ACCOUNT_NAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;storage account name&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;AZURE_CONTAINER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;static&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;AZURE_ACCOUNT_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;storage account key&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;AZURE_OVERWRITE_FILES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="n"&gt;STATIC_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;static/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="n"&gt;STATICFILES_DIRS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;BASE_DIR&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;static&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;Of course, storing the connection strings and keys as plaintext in settings.py would introduce a bunch of vulnerabilities, so I used environment variables, GitHub Secrets, and App Service Settings to secure the important bits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DJANGO_SECRET_KEY - Used to update SECRET_KEY in production
- Secret = DJANGO_SECRET_KEY
DJANGO_DEBUG - Used to disable debug mode in production
- No Secret
DJONGO_HOST - Used to secure connection string to Cosmos MongoDB
- Secret = DJONGO_HOST        
ACCOUNT_KEY - Used to secure connection key to Azure Storage hosting static files
- Secret = ACCOUNT_KEY
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
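
&lt;p&gt;A minimal sketch of how settings.py could read these values, assuming they arrive as plain environment variables. The env_bool helper and the fallback values are hypothetical and only there for illustration. The helper matters because the workflow passes DJANGO_DEBUG as the string "False", and any non-empty string is truthy in Python.&lt;/p&gt;

```python
import os


def env_bool(name, default=False):
    """Read an environment variable as a boolean ('False' must not count as true)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Hypothetical fallbacks for local development only; production gets real values
# from App Service Settings
SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY", "dev-only-insecure-key")
DEBUG = env_bool("DJANGO_DEBUG", default=False)
DJONGO_HOST = os.environ.get("DJONGO_HOST", "")
AZURE_ACCOUNT_KEY = os.environ.get("ACCOUNT_KEY", "")
```

&lt;p&gt;Defaulting DEBUG to off means a missing variable fails safe rather than exposing debug pages.&lt;/p&gt;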



&lt;p&gt;These can be added as Application Settings via the &lt;a href="https://github.com/Azure/appservice-settings" rel="noopener noreferrer"&gt;azure/appservice-settings&lt;/a&gt; GitHub Action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azure/appservice-settings@v1&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;app-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ env.AZURE_WEBAPP_NAME }}&lt;/span&gt;
     &lt;span class="na"&gt;mask-inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
     &lt;span class="na"&gt;app-settings-json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;
          &lt;span class="s"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"name":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"ACCOUNT_KEY",&lt;/span&gt;
            &lt;span class="s"&gt;"value":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"${{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;secrets.ACCOUNT_KEY&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}",&lt;/span&gt;
            &lt;span class="s"&gt;"slotSetting":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;
          &lt;span class="s"&gt;},&lt;/span&gt;
          &lt;span class="s"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"name":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"DJANGO_DEBUG",&lt;/span&gt;
            &lt;span class="s"&gt;"value":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"False",&lt;/span&gt;
            &lt;span class="s"&gt;"slotSetting":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;
          &lt;span class="s"&gt;},&lt;/span&gt;
          &lt;span class="s"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"name":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"DJANGO_SECRET_KEY",&lt;/span&gt;
            &lt;span class="s"&gt;"value":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"${{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;secrets.DJANGO_SECRET_KEY&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}",&lt;/span&gt;
            &lt;span class="s"&gt;"slotSetting":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;
          &lt;span class="s"&gt;},&lt;/span&gt;
          &lt;span class="s"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"name":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"DJONGO_HOST",&lt;/span&gt;
            &lt;span class="s"&gt;"value":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"${{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;secrets.DJONGO_HOST&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}",&lt;/span&gt;
            &lt;span class="s"&gt;"slotSetting":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;false&lt;/span&gt;
          &lt;span class="s"&gt;}&lt;/span&gt;
        &lt;span class="s"&gt;]'&lt;/span&gt;
     &lt;span class="na"&gt;general-settings-json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{"linuxFxVersion":&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"PYTHON|${{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;env.PYTHON_VERSION&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Terraform
&lt;/h2&gt;

&lt;p&gt;I have two main Terraform configurations, both using the Azure backend for remote state and a Service Principal for authentication (see Security below).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;global - This is used to define the resource group and storage account used by both the remote state and static files. Using the same storage account for both probably isn't considered a best practice, but I thought it acceptable for a small personal project where everything would likely share the same lifecycle.&lt;/li&gt;
&lt;li&gt;web - This defines the infrastructure my Django app needs to run: the Cosmos DB, the static container, and the App Service.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;For authentication, I am using a &lt;a href="https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/service_principal_client_secret" rel="noopener noreferrer"&gt;Service Principal&lt;/a&gt; and setting the below environment variables during the workflow run, with their values retrieved from repo secrets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ARM_CLIENT_ID
ARM_CLIENT_SECRET
ARM_SUBSCRIPTION_ID 
ARM_TENANT_ID 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
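
&lt;p&gt;If one of these variables is missing, the failure tends to show up late in the workflow run. As a rough sketch, a small pre-flight check could catch that earlier. The missing_arm_vars function is a hypothetical helper, not something the workflow actually contains:&lt;/p&gt;

```python
import os

# The four variables listed above
REQUIRED_ARM_VARS = (
    "ARM_CLIENT_ID",
    "ARM_CLIENT_SECRET",
    "ARM_SUBSCRIPTION_ID",
    "ARM_TENANT_ID",
)


def missing_arm_vars(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ARM_VARS if not env.get(name, "").strip()]


# Demo with a fake environment where only the client ID is set
demo_missing = missing_arm_vars({"ARM_CLIENT_ID": "00000000-demo"})
print(demo_missing)
```

&lt;p&gt;In a workflow step this could run before the Terraform commands and exit with an error when the returned list is not empty.&lt;/p&gt;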



&lt;h2&gt;
  
  
  Pipelines
&lt;/h2&gt;

&lt;p&gt;Just two pipelines at the moment:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;deploy.yml - Not well-named, but it's what I used to deploy my web Terraform configuration. I want to thank Facundo Gauna for his wonderful article on deploying &lt;a href="https://gaunacode.com/deploying-terraform-at-scale-with-github-actions" rel="noopener noreferrer"&gt;Terraform with GitHub Actions&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;appcontent.yml - Used to deploy my app content to the Azure App Service. Using &lt;a href="https://github.com/Azure/actions-workflow-samples/blob/master/AppService/python-webapp-on-azure.yml" rel="noopener noreferrer"&gt;this sample&lt;/a&gt; provided by Microsoft, with very minor modifications. &lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Dockerfile
&lt;/h2&gt;

&lt;p&gt;I started creating a Dockerfile with the intention of containerizing the app, which I still might do, but I haven't finished it since I decided on Azure App Service hosting instead of a container instance. At this point it really only needs environment variables for settings.py, and then to be published somewhere. &lt;/p&gt;

</description>
      <category>beginners</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Cloud Resume Challenge Part #1</title>
      <dc:creator>Cody Meadows</dc:creator>
      <pubDate>Mon, 21 Mar 2022 02:00:50 +0000</pubDate>
      <link>https://dev.to/cmeadowstech/cloud-resume-challenge-part-1-2895</link>
      <guid>https://dev.to/cmeadowstech/cloud-resume-challenge-part-1-2895</guid>
      <description>&lt;p&gt;In my efforts to pick up some skills with Azure, I came across the &lt;a href="https://cloudresumechallenge.dev/docs/the-challenge/azure/"&gt;Cloud Resume Challenge&lt;/a&gt; and thought I'd throw myself at it while studying for the AZ-104. It consists of several sections designed to throw you into various skills used when working in the Azure environment - blob storage, CDNs, programming with JavaScript and Python, and automation with Azure Functions and GitHub Actions.&lt;/p&gt;

&lt;p&gt;I'm already somewhat familiar with HTML and CSS so while I'm not far into my Azure studies, the first few steps were pretty easy to bang out over the weekend. So here I present to you my new resume website - &lt;a href="https://web.cmeadows.tech/"&gt;https://web.cmeadows.tech/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Certification
&lt;/h2&gt;

&lt;p&gt;I actually got the AZ-900 last summer. I know, it's been a while between then and now, but I was busy getting accustomed to my new L2 position. (There is no L3 at my company, next stop Microsoft.) This was really an introductory certification anyways, not one that had you doing much of anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  HTML
&lt;/h2&gt;

&lt;p&gt;I've put together a few websites over the years, so a simple static page was pretty quick to get together. It's mostly just some DIVs for each section, and some headers, paragraphs and unordered lists for the contents.&lt;/p&gt;

&lt;p&gt;I mean, look at this. Not trying to undersell my work but I know people would scoff at this being called "web development"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    &amp;lt;div class="section"&amp;gt;
    &amp;lt;h2 class="head"&amp;gt;Certifications &amp;amp; Skills&amp;lt;/h2&amp;gt;
        &amp;lt;ul&amp;gt;
            &amp;lt;li&amp;gt;&amp;lt;b&amp;gt;Certifications:&amp;lt;/b&amp;gt; CompTIA Network+, MS-900: Microsoft 365 Fundamentals, AZ-900: Microsoft Azure Fundamentals, CompTIA A+&amp;lt;/li&amp;gt;
            &amp;lt;li&amp;gt;&amp;lt;b&amp;gt;Skills:&amp;lt;/b&amp;gt; Office 365, Azure, Active Directory, PowerShell, Exchange, DNS, Cloud Security, Remote Access Software, Windows 10, macOS, Android, Microsoft Office&amp;lt;/li&amp;gt;
        &amp;lt;/ul&amp;gt;
    &amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CSS
&lt;/h2&gt;

&lt;p&gt;This is also pretty straightforward. I used a basic reset template to normalize everything and made some minor flavor changes: a background image, centered resume, rounded corners, dashed borders. I might add some animation to it if I get bored. &lt;/p&gt;

&lt;h2&gt;
  
  
  Static Website
&lt;/h2&gt;

&lt;p&gt;Getting into the fun stuff. I hadn't really done anything with blob storage when studying for the AZ-900. Azure Static Webpages made this real easy though - enable it and upload your files to the pre-configured $web container.&lt;/p&gt;

&lt;p&gt;I did create a PowerShell script to upload my updated files though so I don't have to putz around in the portal. It seemed like most people used AzCopy for this, but I opted for Set-AzStorageBlobContent instead. Partly because I'm more comfortable with PowerShell, partly because I was worried how AzCopy integrates with GitHub Actions down the line.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Connect-AzAccount       # If you have multiple directories you will need to specify with -Tenant

# Set container and context below

$localfolder = "C:\your path here"
$storageAccount = Get-AzStorageAccount -ResourceGroupName "Your Resource Group" -Name "Name of storage account"
$Context = $storageAccount.context
$ContainerName = '$web'
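Add-Type -AssemblyName System.Web    # Loads [System.Web.MimeMapping], which Windows PowerShell does not load by default
$files = Get-ChildItem $localfolder -File    # Only files; subfolders are not uploaded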

# Replace files if they exist, and upload them if they don't

foreach ($file in $files) {
    $name = $file.name
    $path = "$($localfolder)\$($name)"
    $blob = Get-AzStorageBlob -Container $ContainerName -Context $Context -Blob $name -ErrorAction:SilentlyContinue
    if ($null -eq $blob) {
        Set-AzStorageBlobContent -Container $ContainerName -Context $Context -File $path -Blob $name -Properties @{"ContentType" = [System.Web.MimeMapping]::GetMimeMapping($path)}    # If the file does not currently exist on the container
    } else {
        $blob | Set-AzStorageBlobContent -File $path -Properties @{"ContentType" = [System.Web.MimeMapping]::GetMimeMapping($path)} -Force    # If the file does currently exist on the container
    }
}

# Purge CDN Endpoint
Get-AzCdnProfile | Get-AzCdnEndpoint | Unpublish-AzCdnEndpointContent -PurgeContent "/*"    
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
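
&lt;p&gt;The same replace-or-upload logic can be sketched in Python without touching Azure at all. This is only an illustration of the decision the script makes per file: plan_uploads is a hypothetical helper, the set of existing blob names is assumed to come from wherever you list the container, and the standard mimetypes module stands in for MimeMapping.&lt;/p&gt;

```python
import mimetypes
import tempfile
from pathlib import Path


def plan_uploads(local_dir, existing_blobs):
    """Decide, per local file, whether to create or overwrite and which content type to set."""
    plan = []
    for path in sorted(Path(local_dir).iterdir()):
        if not path.is_file():
            continue  # subfolders are skipped, as in the script above
        # Fall back to a generic type when the extension is unknown
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        action = "overwrite" if path.name in existing_blobs else "create"
        plan.append((path.name, action, content_type))
    return plan


# Demo: two placeholder files, one of which already exists in the container
with tempfile.TemporaryDirectory() as folder:
    for name in ("index.html", "style.css"):
        Path(folder, name).write_text("placeholder")
    plan = plan_uploads(folder, existing_blobs={"index.html"})

for entry in plan:
    print(entry)
```

&lt;p&gt;Setting the content type explicitly matters for a static site, since a browser may download rather than render a page that is served with a generic type.&lt;/p&gt;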



&lt;h2&gt;
  
  
  HTTPS
&lt;/h2&gt;

&lt;p&gt;Configuring the CDN was easy, but setting up HTTPS was more of a pain than it should be, in my opinion. Enabling it for subdomains is a flip of a switch, but it seems inordinately difficult to do so for a root domain. The only options are to buy an expensive certificate from someone like DigiCert or to set up a complex automated process to obtain and renew certificates from Let's Encrypt.&lt;/p&gt;

&lt;p&gt;In the future I'll revisit automating Let's Encrypt certificates, but for now I'll opt to use a subdomain for my site.&lt;/p&gt;

&lt;h2&gt;
  
  
  DNS
&lt;/h2&gt;

&lt;p&gt;Azure DNS zones are pretty easy to set up. I had to set my nameservers in Namecheap, but other than that they have an easy GUI to point your records to existing resources like the CDN endpoint I previously configured.&lt;/p&gt;

&lt;h2&gt;
  
  
  In summary
&lt;/h2&gt;

&lt;p&gt;At this point my Resource Group contains the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Storage account with a Static Web Page enabled&lt;/li&gt;
&lt;li&gt;CDN Endpoint&lt;/li&gt;
&lt;li&gt;Front Door and CDN profiles&lt;/li&gt;
&lt;li&gt;DNS Zone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hopefully more to add as I start to work through the AZ-104. I still have quite a list to finish up:&lt;/p&gt;

&lt;p&gt;JavaScript&lt;br&gt;
Database&lt;br&gt;
API&lt;br&gt;
Python&lt;br&gt;
Tests&lt;br&gt;
Infrastructure as Code&lt;br&gt;
Source Control&lt;br&gt;
CI/CD (Back end)&lt;br&gt;
CI/CD (Front end)&lt;br&gt;
Blog post (Well, here's part #1. I probably won't have part #2 until I finish this thing)&lt;/p&gt;

</description>
      <category>cloudresumechallenge</category>
      <category>azure</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
