<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hackers And Slackers</title>
    <description>The latest articles on DEV Community by Hackers And Slackers (@hackersandslackers).</description>
    <link>https://dev.to/hackersandslackers</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F384%2Fb7f462d9-dfbb-4950-b2f8-496940001c8e.png</url>
      <title>DEV Community: Hackers And Slackers</title>
      <link>https://dev.to/hackersandslackers</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hackersandslackers"/>
    <language>en</language>
    <item>
      <title>Serving Assets via CDN with Google Cloud</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Sat, 23 Apr 2022 16:37:03 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/serving-assets-via-cdn-with-google-cloud-1260</link>
      <guid>https://dev.to/hackersandslackers/serving-assets-via-cdn-with-google-cloud-1260</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--dcAKDqo---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hackersandslackers.com/2022/02/gcp-loadbalancer-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--dcAKDqo---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://cdn.hackersandslackers.com/2022/02/gcp-loadbalancer-1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pardon my nostalgia, but it's hard not to miss the good old days of the world-wide-web. Websites consisted of the most straightforward architecture: &lt;strong&gt;single-node servers&lt;/strong&gt;. It's hard to imagine the internet of the 90s, where blogs, forums, and obscure Flash sites were almost exclusively hosted on simple LAMP stacks instead of a myriad of cloud services that AWS has convinced us we need. We didn't need entire VPCs serving as ecosystems to countless microservices. Instead, each of us spun up individual servers. By no coincidence, the vast majority of such US-based servers were located in Texas.&lt;/p&gt;

&lt;p&gt;Cheap energy costs aside, Texas is notably equidistant between the east and west coasts of the United States. Most US citizens live in coastal states; thus, we can expect the majority of traffic to any US-based website to originate on opposite sides of the country.&lt;/p&gt;

&lt;p&gt;Connections to a server need to travel physical distances, and traveling distance takes time, more commonly known as &lt;strong&gt;latency&lt;/strong&gt;. Therefore, as our user base grows and strays further and further from our node in Texas, the average user will experience longer load times when fetching assets from our site.&lt;/p&gt;
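
&lt;p&gt;To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes signals in optical fiber travel at about 200,000 km/s, and the distances are made-up stand-ins rather than measurements. Real requests take longer, since routes are never straight lines and every hop adds overhead.&lt;/p&gt;

```python
# Rough propagation delay, assuming signals in optical fiber travel at
# about 200,000 km/s (roughly two thirds of the speed of light in a vacuum).
FIBER_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over a one-way distance."""
    return 2 * distance_km * 1000 / FIBER_SPEED_KM_PER_S


# Hypothetical one-way distances from a server in Texas (illustrative only).
for label, km in [("same region", 500), ("cross-country", 2_500), ("overseas", 15_000)]:
    print(f"{label}: at least {round_trip_ms(km):.0f} ms per round trip")
```

&lt;p&gt;A single page load usually involves many round trips, so this floor gets multiplied quickly.&lt;/p&gt;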

&lt;p&gt;For all the magic of the internet, no amount of magic can defy physics to "solve" latency. That said, we can implement something &lt;em&gt;indistinguishable&lt;/em&gt; from magic 😉.&lt;/p&gt;

&lt;h2&gt;
  
  
  CDNs In A Nutshell
&lt;/h2&gt;

&lt;p&gt;The "&lt;em&gt;not quite magic&lt;/em&gt;" solution to our latency problem is a &lt;strong&gt;Content Delivery Network&lt;/strong&gt;. We can't control where our users are geographically, but we &lt;em&gt;can&lt;/em&gt; control where we serve chunks of our app from. For example, why should users in Asia wait to load images (or other static assets) from our app's server in Texas, when there are quality data centers in Singapore? CDNs are designed to mitigate this very scenario.&lt;/p&gt;

&lt;p&gt;A CDN is a distributed network of machines that spans countries and continents to serve assets closer to the users requesting them. For example, a user in Europe might fetch &lt;strong&gt;Asset X&lt;/strong&gt; from an edge node in the Netherlands. In contrast, a user in the Philippines might fetch &lt;strong&gt;Asset X&lt;/strong&gt; from an edge node in Manila. Our hypothetical app still resides in Texas, yet neither user had to wait for &lt;strong&gt;Asset X&lt;/strong&gt; to make its way halfway across the world via an undersea cable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6EdxUVAQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-before-copy.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6EdxUVAQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-before-copy.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="580"&gt;&lt;/a&gt;&lt;/p&gt;
Visualizing the architecture of a CDN
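
&lt;p&gt;As a toy illustration of the "serve from the closest node" idea, the sketch below picks the nearest edge by great-circle distance. The node list and coordinates are approximations I've made up for the example. Real CDNs generally route by network topology rather than raw geography, so treat this as intuition, not implementation.&lt;/p&gt;

```python
import math

# Approximate (latitude, longitude) pairs for a few hypothetical edge nodes.
EDGE_NODES = {
    "Amsterdam": (52.37, 4.90),
    "Manila": (14.60, 120.98),
    "Singapore": (1.35, 103.82),
    "Dallas": (32.78, -96.80),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def nearest_edge(user_location):
    """Name of the edge node geographically closest to the user."""
    return min(EDGE_NODES, key=lambda name: haversine_km(user_location, EDGE_NODES[name]))


print(nearest_edge((52.52, 13.40)))   # a user in Berlin
print(nearest_edge((10.32, 123.89)))  # a user in Cebu
```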



&lt;h3&gt;
  
  
  CDNs are Worth the Trouble
&lt;/h3&gt;

&lt;p&gt;Performance improvements offered by CDNs have shifted from "borderline excessive" to "absolutely critical" for the success of many (if not most) internet entities. Latency has a measurable impact on the bottom line - you're probably familiar with the &lt;a href="https://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html"&gt;Amazon study&lt;/a&gt; that equated &lt;em&gt;100ms&lt;/em&gt; in additional latency to a &lt;em&gt;1% loss&lt;/em&gt; in revenue.&lt;/p&gt;

&lt;p&gt;Depending on your business model (if you have one), an equally significant (or arguably more significant) side effect is the indirect impact latency has on SEO. Tools like &lt;a href="https://web.dev/"&gt;Google's Lighthouse&lt;/a&gt; have brought excruciating visibility into how serving large uncached assets affects site ranking. The results can be brutal:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TxV3OCv_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-lighthouse-1-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TxV3OCv_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-lighthouse-1-1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="688"&gt;&lt;/a&gt;&lt;/p&gt;
The negative impact of poor performance



&lt;p&gt;By Google's standards, the performance of &lt;strong&gt;Hackersandslackers.com&lt;/strong&gt; is God-awful. Unbeknownst to me, I'd likely been committing atrocities against Google's page rank algorithms for years prior to the availability of Lighthouse. This is particularly unfortunate for a site where Google is the single traffic source.&lt;/p&gt;

&lt;p&gt;Given the topic of this post, it should come as no surprise that these shortcomings are almost always related to CDN issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Serving large assets from a centralized location (as opposed to a distributed CDN).&lt;/li&gt;
&lt;li&gt;Failing to assign proper cache policies to large assets, which forces needless reloads.&lt;/li&gt;
&lt;li&gt;Serving massive JS/CSS bundles arbitrarily defined by build processes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--B2BSPibL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-lighthouse-2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--B2BSPibL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-lighthouse-2.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="364"&gt;&lt;/a&gt;&lt;/p&gt;
Large assets lacking efficient cache policies hindering page rank
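
&lt;p&gt;For context, a "cache policy" mostly boils down to the &lt;code&gt;Cache-Control&lt;/code&gt; header sent alongside each asset. Below is a minimal sketch of composing one from standard HTTP directives. The helper name and the lifetimes are my own assumptions for illustration, not recommendations from Lighthouse.&lt;/p&gt;

```python
def cache_control(max_age_seconds: int, public: bool = True, immutable: bool = False) -> str:
    """Build a Cache-Control header value from standard HTTP directives."""
    directives = ["public" if public else "private", f"max-age={max_age_seconds}"]
    if immutable:
        directives.append("immutable")
    return ", ".join(directives)


# Hypothetical policies: fingerprinted bundles rarely change, images change occasionally.
ONE_DAY = 60 * 60 * 24
print(cache_control(365 * ONE_DAY, immutable=True))  # long-lived, versioned assets
print(cache_control(ONE_DAY))                        # images that may be replaced
```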



&lt;h3&gt;
  
  
  What about Cloudflare? Netlify?
&lt;/h3&gt;

&lt;p&gt;"Plug-and-play" CDN services such as these fill the demand for a common (and necessary) niche. Such services have, dare I say, "democratized" CDNs and the performance benefits they yield via an effective 1-2 punch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Affordable pricing (ranging from &lt;em&gt;free&lt;/em&gt; to a ceiling of perhaps &lt;em&gt;$20/month&lt;/em&gt;).&lt;/li&gt;
&lt;li&gt;Zero-effort implementation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Indeed, any web app can live behind a Cloudflare CDN for zero dollars while requiring zero effort. This is good enough for most people, and I would implore those people to stop reading now and live a happier life by sticking with what's easy.&lt;/p&gt;

&lt;p&gt;Unfortunately for me, these CDN services aren't a good fit for sites heavy on user-generated content. Content on &lt;strong&gt;Hackersandslackers.com&lt;/strong&gt; is primarily generated by &lt;em&gt;editors&lt;/em&gt; (posts containing images), with secondary content coming from &lt;em&gt;users&lt;/em&gt; (comments, personal profiles, avatars). "Content" can be created by nearly anybody at any time. It's almost impossible to effectively cache apps that are constantly changing by nature.&lt;/p&gt;

&lt;h2&gt;
  
  
  Google Cloud CDN Architecture
&lt;/h2&gt;

&lt;p&gt;Google Cloud follows its own unique service architecture that's admittedly a bit unintuitive. &lt;a href="https://cloud.google.com/cdn/docs/overview"&gt;GCP's own documentation&lt;/a&gt; doesn't do much to help this problem, which is precisely what drove me to write this piece in the first place.&lt;/p&gt;

&lt;p&gt;Setting up a proper Google Cloud CDN requires four separate GCP services:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/dns"&gt;Cloud DNS&lt;/a&gt;: Safely look past the obnoxious inclusion of the word "Cloud" here (Google marketing be damned). This "service" involves DNS configuration for a given domain, such as adding A/AAAA records. This allows us to serve assets from a custom domain (such as &lt;strong&gt;cdn.hackersandslackers.com&lt;/strong&gt;, in my case).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/load-balancing"&gt;Load Balancer&lt;/a&gt;: It feels a bit excessive, but GCP forces us to handle the routing of a &lt;strong&gt;frontend&lt;/strong&gt; (our DNS) to a &lt;strong&gt;backend&lt;/strong&gt; (storage bucket) via a load balancer.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/storage"&gt;Cloud Storage&lt;/a&gt;: Google's equivalent to an S3 bucket. This is where we'll be storing assets to serve our users.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/cdn/docs/overview"&gt;Cloud CDN&lt;/a&gt;: When we configure our load balancer to point to a Cloud Storage bucket, we receive the option to enable "Cloud CDN." This is what tells Google to serve assets from edges in its global network and allows us to set caching policies.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The final product looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vSBYD3wt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_loadbalancer-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vSBYD3wt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_loadbalancer-1.png" alt="Serving Assets via CDN with Google Cloud" width="880" height="524"&gt;&lt;/a&gt;&lt;/p&gt;
Architecture of a CDN on Google Cloud



&lt;p&gt;We'll tackle this one step at a time. Before we can configure our &lt;strong&gt;Load Balancer&lt;/strong&gt;, we'll need our &lt;strong&gt;Cloud Storage&lt;/strong&gt; backend and &lt;strong&gt;Cloud DNS&lt;/strong&gt; frontends in place. We'll start with the former.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create a GCS Bucket
&lt;/h2&gt;

&lt;p&gt;While logged into the Google Cloud console, go to your &lt;a href="https://console.cloud.google.com/storage/browser"&gt;bucket browser&lt;/a&gt;. Here we'll create a new storage bucket via the self-explanatory "Create Bucket" button:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XDX__XhJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-gcs1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XDX__XhJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-gcs1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="264"&gt;&lt;/a&gt;&lt;/p&gt;
Creating a new Google Cloud storage bucket



&lt;p&gt;Give your bucket a name and you're good to go:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CUnnLi06--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-gcs2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CUnnLi06--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp-cdn-gcs2.png" alt="Serving Assets via CDN with Google Cloud" width="880" height="832"&gt;&lt;/a&gt;&lt;/p&gt;
Configure your bucket &amp;amp; keep it simple
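
&lt;p&gt;If you'd rather skip the clicking, the same step can be scripted. The sketch below only &lt;em&gt;builds&lt;/em&gt; a &lt;code&gt;gcloud&lt;/code&gt; command so you can inspect it before running anything. It assumes the gcloud CLI is installed and authenticated, and the bucket name and location are placeholders of my own.&lt;/p&gt;

```python
import shlex


def create_bucket_command(bucket_name: str, location: str = "US") -> list[str]:
    """Arguments for creating a Cloud Storage bucket with the gcloud CLI."""
    return [
        "gcloud", "storage", "buckets", "create", f"gs://{bucket_name}",
        f"--location={location}",
    ]


# "my-cdn-assets" is a placeholder bucket name.
print(shlex.join(create_bucket_command("my-cdn-assets")))
```

&lt;p&gt;Passing the resulting list to &lt;code&gt;subprocess.run&lt;/code&gt; would execute it, if you trust what you see.&lt;/p&gt;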



&lt;h2&gt;
  
  
  Configure Cloud DNS
&lt;/h2&gt;

&lt;p&gt;I'll assume you have a domain in your possession from which to serve assets (this can be a subdomain, such as &lt;code&gt;cdn.[YOUR_DOMAIN].com&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Navigate to the &lt;a href="https://console.cloud.google.com/net-services/dns/zones"&gt;Cloud DNS&lt;/a&gt; console and &lt;strong&gt;create a new zone&lt;/strong&gt;, as such:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--opyfZQtW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_clouddns_1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--opyfZQtW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_clouddns_1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="414"&gt;&lt;/a&gt;&lt;/p&gt;
Add a DNS zone to serve assets from a vanity domain



&lt;p&gt;This will prompt you to &lt;em&gt;"Create a DNS zone."&lt;/em&gt; This is straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set &lt;strong&gt;Zone type&lt;/strong&gt; to &lt;em&gt;Public&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Enter a &lt;strong&gt;Zone name&lt;/strong&gt; (this can be anything you please; it is for internal use only).&lt;/li&gt;
&lt;li&gt;Set &lt;strong&gt;DNS name&lt;/strong&gt; to the domain in your possession from which all assets will be hosted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--z13v_Rzx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_clouddns_3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z13v_Rzx--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_clouddns_3.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="1035"&gt;&lt;/a&gt;&lt;/p&gt;
Create a public DNS zone for your CDN's domain
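
&lt;p&gt;For the command-line inclined, here's a rough equivalent of the form above, again as a command builder rather than something that runs on its own. The zone name and domain are placeholders, and I'm assuming the gcloud CLI is installed and pointed at your project.&lt;/p&gt;

```python
import shlex


def create_zone_command(zone_name: str, dns_name: str) -> list[str]:
    """Arguments for creating a public Cloud DNS managed zone with gcloud."""
    # Zone DNS names are fully qualified, so append a trailing dot if missing.
    fqdn = dns_name if dns_name.endswith(".") else dns_name + "."
    return [
        "gcloud", "dns", "managed-zones", "create", zone_name,
        f"--dns-name={fqdn}",
        "--visibility=public",
        f"--description=Zone for {dns_name}",
    ]


# "cdn-zone" and "cdn.example.com" are placeholders for your own values.
print(shlex.join(create_zone_command("cdn-zone", "cdn.example.com")))
```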



&lt;p&gt;Your newly created &lt;strong&gt;DNS zone&lt;/strong&gt; should be visible from the Cloud DNS console:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nL-qMkkC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_clouddns_2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nL-qMkkC--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_clouddns_2.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="513"&gt;&lt;/a&gt;&lt;/p&gt;
Your DNS Zone details



&lt;p&gt;Google should provide you with 4 nameservers. Ensure your domain is pointed to these nameservers via whichever provider you purchased your domain from:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--q7m0TpSF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_clouddns_4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--q7m0TpSF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_clouddns_4.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="940"&gt;&lt;/a&gt;&lt;/p&gt;
Point your domain NS records to the provided nameservers



&lt;h2&gt;
  
  
  Create &amp;amp; Configure a Load Balancer
&lt;/h2&gt;

&lt;p&gt;Now that we've set up our "&lt;strong&gt;frontend&lt;/strong&gt;" and "&lt;strong&gt;backend&lt;/strong&gt;," we can turn our attention to the meat and potatoes of our architecture: the &lt;strong&gt;Load Balancer&lt;/strong&gt;. Navigate to GCP's &lt;a href="https://console.cloud.google.com/net-services/loadbalancing/list/loadBalancers"&gt;Load Balancing&lt;/a&gt; console:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5KIRvGCP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5KIRvGCP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="420"&gt;&lt;/a&gt;&lt;/p&gt;
Navigate to "Load Balancing" under "Network Services"



&lt;p&gt;Here we'll &lt;strong&gt;Create a new Load Balancer&lt;/strong&gt; :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--K6yczrtp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_clouddns_2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--K6yczrtp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_clouddns_2.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="429"&gt;&lt;/a&gt;&lt;/p&gt;
Create a new load balancer



&lt;p&gt;We'll be serving assets over the internet (HTTP/HTTPS), so go ahead and start configuration for &lt;strong&gt;HTTP(S) Load Balancing&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--dqRluvmI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--dqRluvmI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_3.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="1030"&gt;&lt;/a&gt;&lt;/p&gt;
Create an HTTP(S) Load Balancer



&lt;p&gt;The next step is self-explanatory; we want an &lt;strong&gt;Internet-facing&lt;/strong&gt;, &lt;strong&gt;Classic HTTP(S) Load Balancer&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jlEwTtkY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_be_1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jlEwTtkY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_be_1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="531"&gt;&lt;/a&gt;&lt;/p&gt;
Specify internet-facing load balancer



&lt;p&gt;The foundation of our load balancer is in place! Let's make our load balancer do what load balancers do: distribute incoming traffic.&lt;/p&gt;

&lt;p&gt;"Incoming Traffic," in this sense, is any request made to the domain associated with our &lt;strong&gt;Cloud DNS&lt;/strong&gt;  &lt;strong&gt;frontend&lt;/strong&gt;. From a user's perspective, it's as though we're hosting images and such on a domain separate from our application. Behind the scenes, the domain isn't a host so much as a traffic cop.&lt;/p&gt;

&lt;p&gt;Depending on where the user resides, they'll fetch assets from whichever "host" offers the best connection for them, where said host is a replica of our &lt;strong&gt;Cloud Storage&lt;/strong&gt; &lt;strong&gt;backend&lt;/strong&gt;. Thankfully, only a few steps remain for us to achieve this sorcery.&lt;/p&gt;

&lt;h3&gt;
  
  
  I. Backend Configuration
&lt;/h3&gt;

&lt;p&gt;We have a load balancer, but it doesn't do anything just yet. We need to tell the load balancer &lt;em&gt;what&lt;/em&gt; the end "destination" is for incoming requests. Our options are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Backend Service&lt;/strong&gt;: Think servers, cloud-hosted applications, microservices, etc. Basically, things we &lt;em&gt;don't&lt;/em&gt; want.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Backend Bucket&lt;/strong&gt;: Indeed, this is how Google's product team chose to represent CDNs on their platform. As far as GCP is concerned, a CDN is just another service to be served behind a load balancer (and yet it isn't, hence a separate option, but whatever). This is the option we want.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wCC-TQD8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_be_2-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wCC-TQD8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_be_2-1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="420"&gt;&lt;/a&gt;&lt;/p&gt;
Specify Backend Bucket as your load balancer's backend



&lt;p&gt;&lt;em&gt;Now we're actually getting somewhere&lt;/em&gt;. This is perhaps the most important step of all: tuning the configuration of our load balancer's "backend bucket." In this step, we want to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Name our backend bucket&lt;/strong&gt;. Yes, your bucket already has a name - &lt;em&gt;this&lt;/em&gt; name is intended to represent the relationship said bucket has to your load balancer. Look, the process is really, really dumb. But the end result is great.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select which GCS bucket to point to&lt;/strong&gt;. Select the bucket you created earlier via the "browse" button.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ENABLE CLOUD CDN&lt;/strong&gt;! This unassuming checkbox is responsible for the magic that hosts our bucket over a distributed network. It's essentially the entire reason we're here... CLICK IT.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set cache policies&lt;/strong&gt;. I could drone on about which &lt;em&gt;Time To Live (TTL)&lt;/em&gt; settings might give you the best shot at convincing Google to rank your page higher. Instead, I implore you to mimic something similar to what I have below, and encourage you to experiment and make your own tweaks once you get comfortable:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nhfnmGFw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_be_3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nhfnmGFw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/03/gcp_cdn_loadbalancing_be_3.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="934"&gt;&lt;/a&gt;&lt;/p&gt;
Configure the cache settings for your CDN
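
&lt;p&gt;Here's a hedged sketch of the same four choices expressed as a single &lt;code&gt;gcloud&lt;/code&gt; command. The resource names are placeholders, and the TTL defaults are values I picked for illustration; they are not necessarily what appears in the screenshot above. As before, the code only assembles the command for review.&lt;/p&gt;

```python
import shlex


def create_backend_bucket_command(
    name: str,
    gcs_bucket: str,
    default_ttl: int = 3600,
    client_ttl: int = 3600,
    max_ttl: int = 86400,
) -> list[str]:
    """Arguments for a CDN-enabled backend bucket; TTLs are in seconds."""
    return [
        "gcloud", "compute", "backend-buckets", "create", name,
        f"--gcs-bucket-name={gcs_bucket}",   # step 2: which GCS bucket to point to
        "--enable-cdn",                      # step 3: the all-important checkbox
        "--cache-mode=CACHE_ALL_STATIC",     # step 4: cache policy
        f"--default-ttl={default_ttl}",
        f"--client-ttl={client_ttl}",
        f"--max-ttl={max_ttl}",
    ]


# "cdn-backend" and "my-cdn-assets" are placeholder names.
print(shlex.join(create_backend_bucket_command("cdn-backend", "my-cdn-assets")))
```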



&lt;h3&gt;
  
  
  II. Host &amp;amp; Path Rules
&lt;/h3&gt;

&lt;p&gt;Leave &lt;strong&gt;Simple host and path rule&lt;/strong&gt; selected. All we need to do is select the &lt;strong&gt;backend&lt;/strong&gt; we created in the previous step from the &lt;code&gt;Backend 1&lt;/code&gt; dropdown:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PpSphvMf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_hostpath.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PpSphvMf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_hostpath.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="383"&gt;&lt;/a&gt;&lt;/p&gt;
Point your load balancer to your backend bucket
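
&lt;p&gt;Under the hood, a "simple host and path rule" amounts to a URL map whose default destination is our backend bucket. A minimal sketch of the CLI equivalent follows, with placeholder names of my own choosing:&lt;/p&gt;

```python
import shlex


def create_url_map_command(url_map: str, backend_bucket: str) -> list[str]:
    """Arguments for a URL map that sends every request to one backend bucket."""
    return [
        "gcloud", "compute", "url-maps", "create", url_map,
        f"--default-backend-bucket={backend_bucket}",
    ]


# "cdn-url-map" and "cdn-backend" are placeholder names.
print(shlex.join(create_url_map_command("cdn-url-map", "cdn-backend")))
```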



&lt;h3&gt;
  
  
  III. Frontend Configuration
&lt;/h3&gt;

&lt;p&gt;Almost there, folks. We've set up a backend bucket to host assets, configured a distributed CDN from said backend bucket, and configured a load balancer to point to said CDN. All that remains is accepting incoming traffic for our load balancer to distribute.&lt;/p&gt;

&lt;p&gt;The domain associated with our &lt;strong&gt;Cloud DNS&lt;/strong&gt; will serve as our source of traffic. Go ahead and &lt;strong&gt;Add Frontend IP And Port&lt;/strong&gt; :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--p4-WdSY8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--p4-WdSY8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="369"&gt;&lt;/a&gt;&lt;/p&gt;
Map an IP address to your backend bucket



&lt;p&gt;This will expand a configuration menu for accepting traffic on a given hostname. We'll be creating configurations to accept traffic via &lt;em&gt;HTTP&lt;/em&gt; (port 80) and &lt;em&gt;HTTPS&lt;/em&gt; (port 443).&lt;/p&gt;

&lt;p&gt;Before we do anything, we need to reserve an IP address. Click the &lt;strong&gt;IP Address&lt;/strong&gt; dropdown:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tEVB3_Xb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tEVB3_Xb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_2.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="807"&gt;&lt;/a&gt;&lt;/p&gt;
Create your first HTTP/HTTPS frontend configuration



&lt;p&gt;This simple menu will have a single option to &lt;strong&gt;Create IP Address&lt;/strong&gt; :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8WJU09H3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8WJU09H3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_3.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="857"&gt;&lt;/a&gt;&lt;/p&gt;
Reserve and assign a static IP address



&lt;p&gt;Go ahead and wrap up this frontend config with the default values in place: &lt;strong&gt;Protocol&lt;/strong&gt; set to &lt;code&gt;HTTP&lt;/code&gt;, &lt;strong&gt;IP Version&lt;/strong&gt; as &lt;code&gt;IPv4&lt;/code&gt;, and &lt;strong&gt;Port&lt;/strong&gt; left as &lt;code&gt;80&lt;/code&gt;. Click &lt;strong&gt;Done&lt;/strong&gt; to wrap up the first of our two frontend configs:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--BLsTnoFZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--BLsTnoFZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_4.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="770"&gt;&lt;/a&gt;&lt;/p&gt;
Wrap up your HTTP frontend configuration
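
&lt;p&gt;What the console bundles into one "frontend config" is, as far as I can tell, three separate resources: a reserved global IP address, a target HTTP proxy, and a forwarding rule tying the two together on port 80. The sketch below assembles the corresponding &lt;code&gt;gcloud&lt;/code&gt; commands. Every resource name is a placeholder derived from a hypothetical prefix.&lt;/p&gt;

```python
import shlex


def http_frontend_commands(prefix: str, url_map: str) -> list[list[str]]:
    """gcloud invocations for a static IP, an HTTP proxy, and a port-80 rule."""
    address = f"{prefix}-ip"
    proxy = f"{prefix}-http-proxy"
    return [
        ["gcloud", "compute", "addresses", "create", address,
         "--global", "--ip-version=IPV4"],
        ["gcloud", "compute", "target-http-proxies", "create", proxy,
         f"--url-map={url_map}"],
        ["gcloud", "compute", "forwarding-rules", "create", f"{prefix}-http",
         "--global", f"--address={address}",
         f"--target-http-proxy={proxy}", "--ports=80"],
    ]


# "cdn" and "cdn-url-map" are placeholder resource names.
for command in http_frontend_commands("cdn", "cdn-url-map"):
    print(shlex.join(command))
```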



&lt;p&gt;For our second config (HTTPS), we'll repeat the process with a few notable differences: &lt;strong&gt;Protocol&lt;/strong&gt; should now be &lt;code&gt;HTTPS&lt;/code&gt;, &lt;strong&gt;Port&lt;/strong&gt; should be set to &lt;code&gt;443&lt;/code&gt;, and, most notably, we need to specify an SSL &lt;strong&gt;Certificate&lt;/strong&gt;. Google Cloud makes acquiring SSL certs easy via a dropdown.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create A New Certificate&lt;/strong&gt; here, and we're DONE:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--u3jXkswN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_5-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--u3jXkswN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_fe_5-1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="972"&gt;&lt;/a&gt;&lt;/p&gt;
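
&lt;p&gt;The HTTPS variant can be sketched the same way: a Google-managed certificate, a target HTTPS proxy, and a forwarding rule on port 443. This assumes a static address named &lt;code&gt;[prefix]-ip&lt;/code&gt; has already been reserved, which is purely my naming convention for the example. The domain is a placeholder too.&lt;/p&gt;

```python
import shlex


def https_frontend_commands(prefix: str, url_map: str, domain: str) -> list[list[str]]:
    """gcloud invocations for a managed cert, an HTTPS proxy, and a port-443 rule."""
    cert = f"{prefix}-cert"
    proxy = f"{prefix}-https-proxy"
    return [
        ["gcloud", "compute", "ssl-certificates", "create", cert,
         f"--domains={domain}", "--global"],
        ["gcloud", "compute", "target-https-proxies", "create", proxy,
         f"--url-map={url_map}", f"--ssl-certificates={cert}"],
        # Reuses the static address assumed to have been reserved as "[prefix]-ip".
        ["gcloud", "compute", "forwarding-rules", "create", f"{prefix}-https",
         "--global", f"--address={prefix}-ip",
         f"--target-https-proxy={proxy}", "--ports=443"],
    ]


# "cdn", "cdn-url-map", and "cdn.example.com" are placeholders.
for command in https_frontend_commands("cdn", "cdn-url-map", "cdn.example.com"):
    print(shlex.join(command))
```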

&lt;p&gt;OH GOD YES IT'S FINALLY OVER. I know I told you everything was going to be okay, but deep down I wasn't sure we'd finally get here. Hell, I wouldn't read a tutorial this long... but apparently I'd write one.&lt;/p&gt;

&lt;p&gt;Oh yeah, we haven't hit the &lt;strong&gt;Confirm and Finalize&lt;/strong&gt; button yet.&lt;/p&gt;

&lt;h3&gt;
  
  
  IV. Confirm &amp;amp; Finalize
&lt;/h3&gt;

&lt;p&gt;PRESS THE UPDATE BUTTON FOR GREAT JUSTICE:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2PRu7tbq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_finalize.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2PRu7tbq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_finalize.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="463"&gt;&lt;/a&gt;&lt;/p&gt;
Bask in the glory of what you've created



&lt;p&gt;And NOW we've done it! Christ... it's a good thing Google Cloud can't design UIs or write documentation to save their lives. Otherwise, I'd be out of business.&lt;/p&gt;

&lt;p&gt;Take a breath of fresh air. You've made it, and your CDN is in place. After you take a moment to regroup, you may be wondering how to make sure your CDN is actually doing what we think it's doing. I'll point you in the right direction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring Your CDN in Action
&lt;/h2&gt;

&lt;p&gt;There are two ways to visualize our CDN in action. The first is by monitoring our load balancer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Balancer Traffic
&lt;/h3&gt;

&lt;p&gt;"Monitoring" our load balancer gives us a cool visual of &lt;em&gt;where&lt;/em&gt; incoming traffic originates, and how that traffic is being routed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8Y0UtmdM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8Y0UtmdM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="560"&gt;&lt;/a&gt;&lt;/p&gt;
Navigate to the "Monitoring" tab of your load balancer



&lt;p&gt;Using my own load balancer as an example, the below graphic shows incoming traffic requests broken out by continent:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jln_Vtrk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_2-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jln_Vtrk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_2-1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="796"&gt;&lt;/a&gt;&lt;/p&gt;
Visualize how traffic is being routed through your load balancer



&lt;p&gt;There are some key takeaways from what we're seeing here. First, nearly all traffic is being served over HTTPS, even including requests which were attempted over HTTP. These requests are successfully being forced over a secure connection.&lt;/p&gt;

&lt;p&gt;At the bottom of my chart, I see &lt;em&gt;most&lt;/em&gt; assets are being served from my CDN's cache (seen as &lt;code&gt;SERVED_FROM_CACHE&lt;/code&gt;). This is what we like to see, but perhaps there's room for improvement in my caching policies to ensure this is the case more often.&lt;/p&gt;

&lt;h3&gt;
  
  
  CDN Bandwidth &amp;amp; Hit Rate
&lt;/h3&gt;

&lt;p&gt;We can also monitor our CDN itself. While staying in the "Networking services" tab of Google Cloud, our &lt;strong&gt;Cloud CDN&lt;/strong&gt; can be found via the left-hand menu. We only have a single CDN, so let's check it out:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--6nqyn33O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6nqyn33O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_3.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="505"&gt;&lt;/a&gt;&lt;/p&gt;
Navigate to the "Cloud CDN" tab and select your backend bucket



&lt;p&gt;Smash that &lt;strong&gt;Monitoring&lt;/strong&gt; button to check out how your CDN is performing over time with metrics such as &lt;em&gt;bandwidth&lt;/em&gt;, &lt;em&gt;hit rate&lt;/em&gt;, number of &lt;em&gt;requests&lt;/em&gt;, and more:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ru-Iwa-y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_4-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ru-Iwa-y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2022/04/gcp_cdn_loadbalancer_monitoring_4-1.jpg" alt="Serving Assets via CDN with Google Cloud" width="880" height="1101"&gt;&lt;/a&gt;&lt;/p&gt;
Cloud CDN monitoring



&lt;h2&gt;
  
  
  Parting Thoughts
&lt;/h2&gt;

&lt;p&gt;Well folks, it's been fun (by which I of course mean the opposite). So, &lt;em&gt;was&lt;/em&gt; it worth the effort after all? In my personal case, &lt;strong&gt;web.dev&lt;/strong&gt; bumped my site's performance ranking to 80+, which is a substantial increase. Considering the load times this saves my readers (as well as the SEO bump for not sucking), the process and costs of serving assets via CDN are certainly worth it.&lt;/p&gt;

&lt;p&gt;Was it worth writing this tutorial about a niche service on a relatively small cloud provider? Almost certainly not. Unless you enjoyed it, of course.&lt;/p&gt;

&lt;p&gt;See you next time.&lt;/p&gt;

</description>
      <category>googlecloud</category>
      <category>architecture</category>
      <category>devops</category>
      <category>software</category>
    </item>
    <item>
      <title>Async HTTP Requests with Aiohttp &amp; Aiofiles</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Wed, 02 Feb 2022 09:10:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/async-http-requests-with-aiohttp-aiofiles-noa</link>
      <guid>https://dev.to/hackersandslackers/async-http-requests-with-aiohttp-aiofiles-noa</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hackersandslackers.com%2F2021%2F09%2Faiohttp2-3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hackersandslackers.com%2F2021%2F09%2Faiohttp2-3.jpg" alt="Async HTTP Requests with Aiohttp &amp;amp; Aiofiles"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When building applications within the confines of a single-threaded, synchronous language, the limitations become very obvious very quickly. The first thing that comes to mind is &lt;strong&gt;writes&lt;/strong&gt;: the very definition of an I/O-bound task. When writing data to files (or databases), each "write" action intentionally occupies a thread until the write is complete. This makes a lot of sense for ensuring data integrity in most systems. For example, if two operations simultaneously attempt to update a database record, which one is correct? Alternatively, if a script requires an HTTP request to succeed before continuing, how could we move on until we know the request succeeded?&lt;/p&gt;

&lt;p&gt;HTTP requests are among the most common thread-blocking operations. When we write scripts that expect data from an external third party, we introduce a myriad of uncertainties that can only be answered by the request itself, such as response latency, the nature of the data we expect to receive, or whether the request will succeed at all. Even when working with APIs we're confident in, no operation is sure to succeed until it's complete. Hence, we're "blocked."&lt;/p&gt;

&lt;p&gt;As applications grow in complexity to support more simultaneous user interactions, software is moving away from the paradigm of being executed linearly. So while we might not be sure that a specific request succeeds or a database write is completed, this can be acceptable as long as we have ways to handle and mitigate these issues gracefully.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Problem Worthy of Asynchronous Execution
&lt;/h2&gt;

&lt;p&gt;How long do you suppose it would take a Python script to execute a few hundred HTTP requests, parse each response, and write the output to a single file? If you were to use &lt;strong&gt;requests&lt;/strong&gt; in a simple &lt;code&gt;for&lt;/code&gt; loop, you'd need to wait a fair amount of time for Python to execute each request, open a file, write to it, close it, and move on to the next.&lt;/p&gt;
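
&lt;p&gt;To get a rough feel for the difference before touching the network, here's a minimal sketch that swaps real HTTP requests for &lt;code&gt;asyncio.sleep()&lt;/code&gt;. The &lt;code&gt;fake_request()&lt;/code&gt; helper, the example.com URLs, and the 0.05-second delay are all made up for illustration; real response times will vary:&lt;/p&gt;

```python
import asyncio
import time


async def fake_request(url, delay=0.05):
    """Hypothetical stand-in for an HTTP request: wait, then return a label."""
    await asyncio.sleep(delay)
    return f"fetched {url}"


async def run_sequentially(urls):
    """Await each request before starting the next, like a plain for loop."""
    return [await fake_request(url) for url in urls]


async def run_concurrently(urls):
    """Start every request at once and wait for all of them to finish."""
    return await asyncio.gather(*(fake_request(url) for url in urls))


def timed(coroutine):
    """Run a coroutine to completion; return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coroutine)
    return result, time.perf_counter() - start


# Placeholder URLs; nothing is actually fetched.
urls = [f"https://example.com/post-{i}/" for i in range(20)]
sequential_results, sequential_time = timed(run_sequentially(urls))
concurrent_results, concurrent_time = timed(run_concurrently(urls))
print(f"sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s")
```

&lt;p&gt;The sequential version should take roughly the sum of all the delays, while the concurrent version should take about as long as the single slowest "request."&lt;/p&gt;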

&lt;p&gt;Let's put &lt;strong&gt;asyncio's&lt;/strong&gt; ability to improve script efficiency to an actual test. We'll execute two I/O-blocking actions per task for a few hundred URLs: executing and parsing an HTTP request and writing the desired result to a single file. The &lt;em&gt;input&lt;/em&gt; for our experiment will be a ton of URLs, with the expected &lt;em&gt;output&lt;/em&gt; to be metadata parsed from those URLs. Let's see how long it takes to do this for hundreds of URLs.&lt;/p&gt;

&lt;p&gt;This site has &lt;a href="https://hackersandslackers.com/sitemap-posts.xml" rel="noopener noreferrer"&gt;roughly two hundred&lt;/a&gt; published posts of its own, which makes it a great guinea pig for this little experiment. I've created a CSV that contains the URLs to these posts, which will be our input. Here's a sneak peek below:&lt;/p&gt;

&lt;h3&gt;
  
  
  Sample Input
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;url&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/intro-to-asyncio-concurrency/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/intro-to-asyncio-concurrency/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/multiple-python-versions-ubuntu-20-04/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/multiple-python-versions-ubuntu-20-04/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/google-bigquery-python/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/google-bigquery-python/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/plotly-chart-studio/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/plotly-chart-studio/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/deploy-serverless-golang-functions-with-netlify/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/deploy-serverless-golang-functions-with-netlify/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/scrape-metadata-json-ld/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/scrape-metadata-json-ld/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/terraform-with-google-cloud/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/terraform-with-google-cloud/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/deploy-golang-app-nginx/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/deploy-golang-app-nginx/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/4-ways-to-improve-your-plotly-graphs/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/4-ways-to-improve-your-plotly-graphs/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/create-your-first-golang-app/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/create-your-first-golang-app/&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;...etc&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
Input CSV



&lt;h3&gt;
  
  
  Sample Output
&lt;/h3&gt;

&lt;p&gt;For each URL found in our input CSV, our script will fetch the URL, parse the page, and write some choice data to a single CSV. The result will resemble the below example:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;title&lt;/th&gt;
&lt;th&gt;description&lt;/th&gt;
&lt;th&gt;primary_tag&lt;/th&gt;
&lt;th&gt;url&lt;/th&gt;
&lt;th&gt;published_at&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Intro to Asynchronous Python with Asyncio&lt;/td&gt;
&lt;td&gt;Execute multiple tasks concurrently in Python with Asyncio: Python`s built-in async library.&lt;/td&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/intro-to-asyncio-concurrency/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/intro-to-asyncio-concurrency/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2022-01-04&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy Serverless Golang Functions with Netlify&lt;/td&gt;
&lt;td&gt;Write and deploy Golang Lambda Functions to your GatsbyJS site on Netlify.&lt;/td&gt;
&lt;td&gt;JAMStack&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/deploy-serverless-golang-functions-with-netlify/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/deploy-serverless-golang-functions-with-netlify/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2020-08-02&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSH &amp;amp; SCP in Python with Paramiko&lt;/td&gt;
&lt;td&gt;Automate remote server tasks by using the Paramiko &amp;amp; SCP Python libraries. Use Python to SSH into hosts; execute tasks; transfer files; etc.&lt;/td&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/automate-ssh-scp-python-paramiko/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/automate-ssh-scp-python-paramiko/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2020-01-03&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Create Cloud-hosted Charts with Plotly Chart Studio&lt;/td&gt;
&lt;td&gt;Use Pandas and Plotly to create cloud-hosted data visualizations on-demand in Python.&lt;/td&gt;
&lt;td&gt;Plotly&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/plotly-chart-studio/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/plotly-chart-studio/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2020-09-03&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Create Your First Golang App&lt;/td&gt;
&lt;td&gt;Set up a local Golang environment and learn the basics to create and publish your first &lt;code&gt;Hello world&lt;/code&gt; app.&lt;/td&gt;
&lt;td&gt;Golang&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/create-your-first-golang-app/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/create-your-first-golang-app/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2020-05-25&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Creating Interactive Views in Django&lt;/td&gt;
&lt;td&gt;Create interactive user experiences by writing Django views to handle dynamic content; submitting forms; and interacting with data.&lt;/td&gt;
&lt;td&gt;Django&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/creating-django-views/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/creating-django-views/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2020-04-23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Define Relationships Between SQLAlchemy Data Models&lt;/td&gt;
&lt;td&gt;SQLAlchemy`s ORM easily defines data models with relationships such as one-to-one; one-to-many; and many-to-many relationships.&lt;/td&gt;
&lt;td&gt;SQLAlchemy&lt;/td&gt;
&lt;td&gt;&lt;a href="https://hackersandslackers.com/sqlalchemy-data-models/" rel="noopener noreferrer"&gt;https://hackersandslackers.com/sqlalchemy-data-models/&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2019-07-11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;...etc&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
Example of what our script will output



&lt;h2&gt;
  
  
  Tools For The Job
&lt;/h2&gt;

&lt;p&gt;We're going to need three core Python libraries to pull this off:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.python.org/3/library/asyncio.html" rel="noopener noreferrer"&gt;&lt;strong&gt;Asyncio&lt;/strong&gt;&lt;/a&gt;: Python's bread-and-butter library for running asynchronous IO-bound tasks. It ships with Python's standard library, so there's nothing extra to install. The library has somewhat built itself into the Python core language, introducing &lt;strong&gt;async/await&lt;/strong&gt; keywords that denote when a function is run asynchronously and when to wait on such a function (respectively).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.aiohttp.org/en/stable/" rel="noopener noreferrer"&gt;&lt;strong&gt;Aiohttp&lt;/strong&gt;&lt;/a&gt;: When used on the client side, &lt;strong&gt;aiohttp&lt;/strong&gt; is similar to Python's &lt;strong&gt;requests&lt;/strong&gt; library, except that it makes requests asynchronously. Alternatively, &lt;strong&gt;aiohttp&lt;/strong&gt; can be used inversely: as an application web server to &lt;em&gt;handle&lt;/em&gt; incoming requests &amp;amp; serve responses, but that's a tale for another time.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/Tinche/aiofiles" rel="noopener noreferrer"&gt;&lt;strong&gt;Aiofiles&lt;/strong&gt;&lt;/a&gt;: Makes writing to disk (such as creating and writing bytes to files) a non-blocking task, such that multiple writes can happen on the same thread without blocking one another - even when multiple tasks are bound to the same file.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;aiohttp aiofiles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Install the necessary libraries



&lt;h3&gt;
  
  
  BONUS: Dependencies to Optimize Speed
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;aiohttp&lt;/strong&gt; can execute requests &lt;em&gt;even faster&lt;/em&gt; by simply installing a few supplemental libraries. These libraries are &lt;a href="https://pypi.org/project/cchardet/" rel="noopener noreferrer"&gt;cchardet&lt;/a&gt; (character encoding detection), &lt;a href="https://pypi.org/project/aiodns/" rel="noopener noreferrer"&gt;aiodns&lt;/a&gt; (asynchronous DNS resolution), and &lt;a href="https://pypi.org/project/brotlipy/" rel="noopener noreferrer"&gt;brotlipy&lt;/a&gt; (lossless compression). I'd highly recommend installing these using the conveniently provided shortcut below (take it from me, I'm a stranger on the internet):&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;"aiohttp[speedups]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Install supplemental dependencies to speed up requests



&lt;h2&gt;
  
  
  Preparing an Asynchronous Script/Application
&lt;/h2&gt;

&lt;p&gt;We're going to structure this script like any other Python script. Our main module, &lt;strong&gt;aiohttp_aiofiles_tutorial&lt;/strong&gt;, will handle all of our logic. &lt;strong&gt;config.py&lt;/strong&gt; and &lt;strong&gt;main.py&lt;/strong&gt; both live outside the main module, and offer our script some &lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/blob/master/config.py" rel="noopener noreferrer"&gt;basic configuration&lt;/a&gt; and an &lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/blob/master/main.py" rel="noopener noreferrer"&gt;entry point&lt;/a&gt; respectively:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/aiohttp-aiofiles-tutorial
├── /aiohttp_aiofiles_tutorial
│   ├── __init__.py
│   ├── fetcher.py
│   ├── loops.py
│   ├── tasks.py
│   ├── parser.py
│   └── /data &lt;span class="c"&gt;# Source data&lt;/span&gt;
│       ├── __init__.py
│       ├── parser.py
│       ├── tests
│       └── urls.csv
├── /export &lt;span class="c"&gt;# Destination for exported data&lt;/span&gt;
├── config.py
├── logger.py
├── main.py
├── pyproject.toml
├── Makefile
├── README.md
└── requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Project structure of our async fetcher/writer




&lt;p&gt;&lt;strong&gt;/export&lt;/strong&gt; is simply an empty directory where we'll write our output file.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;/data&lt;/strong&gt; submodule contains the input CSV mentioned above, and some basic logic to parse it. Not much to write home about, but if you're curious, the source is available on &lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/tree/master/aiohttp_aiofiles_tutorial/data" rel="noopener noreferrer"&gt;the GitHub repo&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Kicking Things Off
&lt;/h3&gt;

&lt;p&gt;With sleeves rolled high, we start with the obligatory script "entry point," &lt;strong&gt;main.py&lt;/strong&gt;. This initiates the core function in &lt;strong&gt;/aiohttp_aiofiles_tutorial&lt;/strong&gt; called &lt;code&gt;init_script()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Script entry point.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiohttp_aiofiles_tutorial&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;init_script&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;init_script&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
main.py




&lt;p&gt;This seems like we're running a single function/coroutine &lt;code&gt;init_script()&lt;/code&gt; via &lt;code&gt;asyncio.run()&lt;/code&gt;, which seems counter-intuitive at first glance. Isn't the point of asyncio to run &lt;em&gt;multiple&lt;/em&gt; coroutines concurrently, you ask?&lt;/p&gt;

&lt;p&gt;Indeed it is! &lt;code&gt;init_script()&lt;/code&gt; is a coroutine that calls other coroutines. Some of these coroutines create tasks out of other coroutines, others execute them, etc. &lt;code&gt;asyncio.run()&lt;/code&gt; creates an event loop that &lt;em&gt;doesn't stop running&lt;/em&gt; until the target coroutine is done, including all the coroutines that the parent coroutine calls. So, if we keep things clean, &lt;code&gt;asyncio.run()&lt;/code&gt; is a one-time call to initialize a script.&lt;/p&gt;
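
&lt;p&gt;As a rough illustration of that idea (separate from our actual script), here's a toy example where a single &lt;code&gt;asyncio.run()&lt;/code&gt; call drives a parent coroutine, which in turn creates tasks from child coroutines. The names &lt;code&gt;parent()&lt;/code&gt; and &lt;code&gt;child()&lt;/code&gt; are placeholders of my own:&lt;/p&gt;

```python
import asyncio


async def child(name, delay):
    """Placeholder coroutine: pretend to do some I/O, then report back."""
    await asyncio.sleep(delay)
    return f"{name} finished"


async def parent():
    """Create a task per child coroutine, then wait for all of them."""
    tasks = [
        asyncio.create_task(child(f"child-{i}", 0.01 * i))
        for i in range(3)
    ]
    # gather() returns results in the order the tasks were passed in.
    return await asyncio.gather(*tasks)


# One call to asyncio.run() starts the event loop, runs parent() (and
# everything it awaits) to completion, then closes the loop.
results = asyncio.run(parent())
print(results)
```

&lt;p&gt;Nothing outside of &lt;code&gt;parent()&lt;/code&gt; ever touches the event loop directly, which is the same shape our script will follow with &lt;code&gt;init_script()&lt;/code&gt;.&lt;/p&gt;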
&lt;h3&gt;
  
  
  Initializing Our Script
&lt;/h3&gt;

&lt;p&gt;Here's where the fun begins. We've established that the purpose of our script is to output a single CSV file, and that's where we'll start: by creating and opening an output file within the context of which our entire script will operate:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Make hundreds of requests concurrently and save responses to disk.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;aiofiles&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;EXPORT_FILEPATH&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;init_script&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Prepare output file &amp;amp; kickoff task creation/execution.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiofiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EXPORT_FILEPATH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title,description,primary_tag,url,published_at&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# (The rest of our script logic will be executed here).
&lt;/span&gt;        &lt;span class="c1"&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
aiohttp_aiofiles_tutorial/__init__.py




&lt;p&gt;Our script begins by opening a file context with &lt;code&gt;aiofiles&lt;/code&gt;. As long as our script operates inside the context of an open async file via &lt;code&gt;async with aiofiles.open() as outfile:&lt;/code&gt;, we can write to this file constantly without worrying about opening and closing the file.&lt;/p&gt;

&lt;p&gt;Compare this to the &lt;em&gt;synchronous&lt;/em&gt; (default) implementation of handling file I/O in Python, &lt;code&gt;with open() as outfile:&lt;/code&gt;. By using &lt;code&gt;aiofiles&lt;/code&gt;, we can write data to the same file from multiple sources at &lt;em&gt;virtually the same time.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;EXPORT_FILEPATH&lt;/code&gt; happens to target a CSV (&lt;strong&gt;/export/hackers_pages_metadata.csv&lt;/strong&gt;). Every CSV needs a row of headers; hence our one-off usage of &lt;code&gt;await outfile.write()&lt;/code&gt; to write headers immediately after opening our CSV:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title,description,primary_tag,url,published_at&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Writing a single row to our CSV consisting of column headers
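
&lt;p&gt;Since we're writing CSV rows as plain strings, a field that contains a comma would shift every column after it. One possible safeguard, sketched below, is to build each line with Python's built-in &lt;code&gt;csv&lt;/code&gt; module before handing it to &lt;code&gt;outfile.write()&lt;/code&gt;. The &lt;code&gt;to_csv_line()&lt;/code&gt; helper and the sample row are hypothetical and not part of this tutorial's repo:&lt;/p&gt;

```python
import csv
import io


def to_csv_line(fields):
    """Hypothetical helper: return one CSV-formatted line with quoting applied."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()


header = to_csv_line(["title", "description", "primary_tag", "url", "published_at"])
# Made-up sample row; the comma in the first field forces it to be quoted.
row = to_csv_line(
    ["SSH, SCP and Paramiko", "Automate remote tasks", "Python", "https://example.com/", "2020-01-03"]
)
print(header, row, sep="")
```

&lt;p&gt;The resulting string can be passed to &lt;code&gt;await outfile.write()&lt;/code&gt; exactly like the hard-coded header row.&lt;/p&gt;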



&lt;h3&gt;
  
  
  Moving Along
&lt;/h3&gt;

&lt;p&gt;Below is the fully fleshed-out version of &lt;strong&gt;__init__.py&lt;/strong&gt; that will ultimately put our script into action. The most notable addition is the introduction of the &lt;code&gt;execute_fetcher_tasks()&lt;/code&gt; coroutine; we'll dissect this one piece at a time:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Make hundreds of requests concurrently and save responses to disk.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;aiofiles&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiofiles.threadpool.text&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncTextIOWrapper&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;AsyncIOFile&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;EXPORT_FILEPATH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HTTP_HEADERS&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.data&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urls_to_fetch&lt;/span&gt; &lt;span class="c1"&gt;# URLs parsed from a CSV
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.tasks&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_tasks&lt;/span&gt; &lt;span class="c1"&gt;# Creates one task per URL
&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;init_script&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Prepare output file; begin task creation/execution.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiofiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EXPORT_FILEPATH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title,description,primary_tag,url,published_at&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;execute_fetcher_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# No explicit close() needed: the "async with" block closes outfile on exit.&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_fetcher_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AsyncIOFile&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Open async HTTP session &amp;amp; execute created tasks.

    :param AsyncIOFile outfile: Open async file object to write to.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HTTP_HEADERS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;task_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;create_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;urls_to_fetch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;outfile&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;task_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
aiohttp_aiofiles_tutorial/__init__.py




&lt;p&gt;&lt;code&gt;execute_fetcher_tasks()&lt;/code&gt; is broken out mainly to organize our code. This coroutine accepts &lt;code&gt;outfile&lt;/code&gt; as a parameter, which will serve as the destination for data we end up parsing. Taking this line-by-line:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;async with ClientSession(headers=HTTP_HEADERS) as session&lt;/code&gt;: &lt;strong&gt;aiohttp&lt;/strong&gt;'s &lt;code&gt;ClientSession&lt;/code&gt; opens a client-side session backed by a connection pool which, by default, allows &lt;em&gt;up to 100 simultaneous connections.&lt;/em&gt; Unlike blocking calls made with the Python &lt;strong&gt;requests&lt;/strong&gt; library, these requests don't wait on one another. Because we're making fewer than 200 requests, fetching &lt;em&gt;all&lt;/em&gt; of these URLs should take roughly as long as fetching two of them one after the other.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;create_tasks()&lt;/code&gt;: This function, which we're about to define, accepts three parameters. The first is the async &lt;code&gt;ClientSession&lt;/code&gt; we opened a line earlier. Next is the &lt;code&gt;urls_to_fetch&lt;/code&gt; variable (imported earlier in our script), a simple Python list of strings where each string is a URL parsed from our earlier "input" CSV. That logic is handled elsewhere via a &lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/blob/master/aiohttp_aiofiles_tutorial/data/urls.py" rel="noopener noreferrer"&gt;simple function&lt;/a&gt; (and isn't important for the purpose of this tutorial). Lastly, our &lt;code&gt;outfile&lt;/code&gt; is passed along, as we'll be writing to this file later. With these parameters, &lt;code&gt;create_tasks()&lt;/code&gt; creates a task for each of our 174 URLs, each of which fetches the given URL and writes the parsed metadata to &lt;code&gt;outfile&lt;/code&gt;. The function returns the scheduled tasks; they start running the next time our code yields control to the event loop, which happens via...&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;asyncio.gather(*task_list)&lt;/code&gt;: Asyncio's &lt;code&gt;gather()&lt;/code&gt; function runs a collection of tasks concurrently inside the currently running event loop and waits until all of them have finished. Once this kicks off, the speed benefits of asynchronous I/O become immediately apparent.&lt;/li&gt;
&lt;/ul&gt;
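&lt;p&gt;To see this scheduling behaviour in isolation, here is a minimal sketch that swaps the HTTP requests for &lt;code&gt;asyncio.sleep()&lt;/code&gt;. The names &lt;code&gt;fake_fetch&lt;/code&gt; and &lt;code&gt;run_demo&lt;/code&gt; are hypothetical stand-ins and not part of the tutorial's repository:&lt;/p&gt;

```python
import asyncio
from time import perf_counter


async def fake_fetch(i: int, delay: float = 0.1) -> int:
    """Hypothetical stand-in for an HTTP request: sleep, then return the index."""
    await asyncio.sleep(delay)
    return i


async def run_demo(count: int = 20) -> tuple:
    """Schedule `count` tasks, wait for all of them, and time the whole batch."""
    start = perf_counter()
    # create_task() schedules each coroutine on the running event loop.
    tasks = [asyncio.create_task(fake_fetch(i)) for i in range(count)]
    # gather() waits for every task and returns results in submission order.
    results = await asyncio.gather(*tasks)
    return results, perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_demo())
    print(f"{len(results)} tasks finished in {elapsed:.2f}s")
```

&lt;p&gt;Run one after another, twenty 0.1-second sleeps would take about two seconds. Run as tasks, the batch should finish in roughly the time of a single sleep.&lt;/p&gt;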
&lt;h2&gt;
  
  
  Creating Asyncio Tasks
&lt;/h2&gt;

&lt;p&gt;If you recall, a Python &lt;code&gt;Task&lt;/code&gt; wraps a coroutine and schedules it to run on the event loop. While one task waits on I/O, it is put on hold so other tasks can run. To create a task, we call a predefined coroutine function with the proper parameters and pass the resulting coroutine to &lt;code&gt;asyncio.create_task()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I broke out &lt;code&gt;create_tasks()&lt;/code&gt; to return a list of Python Tasks, where each "task" fetches one of our URLs:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Prepare tasks to be executed.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiofiles.threadpool.text&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncTextIOWrapper&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;AsyncIOFile&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.fetcher&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fetch_url_and_save_data&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AsyncIOFile&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Create asyncio tasks to parse HTTP request responses.

    :param ClientSession session: Async HTTP requests session.
    :param List[str] urls: Resource URLs to fetch.
    :param AsyncIOFile outfile: Open file handle to write to.

    :returns: List[Task]
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;task_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;fetch_url_and_save_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;i&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;task_list&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;task_list&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
aiohttp_aiofiles_tutorial/tasks.py




&lt;p&gt;A few notable things about Asyncio Tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We're defining the &lt;em&gt;"work to be done"&lt;/em&gt; upfront. Creating a &lt;code&gt;Task&lt;/code&gt; doesn't run its coroutine on the spot; it schedules the coroutine on the event loop, which starts it once our code next awaits. Our script will essentially run the same function 174 times concurrently, with different parameters, so it makes sense to define these tasks upfront.&lt;/li&gt;
&lt;li&gt;Defining tasks is quick and straightforward. In an instant, each URL from our CSV will have a corresponding Task created and added to &lt;code&gt;task_list&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;With our tasks prepared, there's only one thing left to do to get the party started: hand control to the event loop and wait for every task to finish. That's where the &lt;code&gt;asyncio.gather(*task_list)&lt;/code&gt; line from &lt;strong&gt;__init__.py&lt;/strong&gt; comes into play.&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;Asyncio's &lt;code&gt;Task&lt;/code&gt; object is a class with its own attributes and methods, providing ways to &lt;a href="https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.done" rel="noopener noreferrer"&gt;check task status&lt;/a&gt;, &lt;a href="https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.cancel" rel="noopener noreferrer"&gt;cancel tasks&lt;/a&gt;, and &lt;a href="https://docs.python.org/3/library/asyncio-task.html#task-object" rel="noopener noreferrer"&gt;so forth.&lt;/a&gt;&lt;/p&gt;
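&lt;p&gt;As a small illustration of those methods, the sketch below checks a task's status before and after cancelling it. &lt;code&gt;slow_job&lt;/code&gt; and &lt;code&gt;inspect_task&lt;/code&gt; are hypothetical names used only for this example:&lt;/p&gt;

```python
import asyncio


async def slow_job() -> str:
    """Hypothetical long-running coroutine."""
    await asyncio.sleep(10)
    return "finished"


async def inspect_task() -> dict:
    """Create a task, check its status, cancel it, and check again."""
    task = asyncio.create_task(slow_job())
    await asyncio.sleep(0)  # Yield once so the task gets a chance to start.
    state = {"done_before_cancel": task.done()}
    task.cancel()  # Request cancellation; it lands at the task's next await.
    try:
        await task
    except asyncio.CancelledError:
        pass  # Expected: awaiting a cancelled task raises CancelledError.
    state["done_after_cancel"] = task.done()
    state["cancelled"] = task.cancelled()
    return state


if __name__ == "__main__":
    print(asyncio.run(inspect_task()))
```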


&lt;h3&gt;
  
  
  Executing our Tasks
&lt;/h3&gt;

&lt;p&gt;Back in &lt;code&gt;create_tasks()&lt;/code&gt;, we created tasks that each execute a coroutine called &lt;code&gt;fetch_url_and_save_data()&lt;/code&gt;. This function does three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make an async request to the given task's URL via &lt;strong&gt;aiohttp&lt;/strong&gt;'s session context (handled by &lt;code&gt;async with session.get(url) as resp:&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Read the body of the response as a string.&lt;/li&gt;
&lt;li&gt;Pass &lt;code&gt;html&lt;/code&gt; to our last function, &lt;code&gt;parse_html_page_metadata()&lt;/code&gt;, and write the metadata it returns to our file:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch URLs, extract their contents, and write parsed data to file.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiofiles.threadpool.text&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncTextIOWrapper&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;AsyncIOFile&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aiohttp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClientError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;InvalidURL&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LOGGER&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.parser&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;parse_html_page_metadata&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_url_and_save_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AsyncIOFile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Fetch raw HTML from a URL prior to parsing.

    :param ClientSession session: Async HTTP requests session.
    :param str url: Target URL to be fetched.
    :param AsyncIOFile outfile: Open file handle to write to.
    :param int total_count: Total number of URLs to fetch.
    :param int i: Current iteration of URL out of total URLs.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
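            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Skipping URL `&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;`: HTTP status &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt;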
            &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;page_metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;parse_html_page_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;url&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;page_metadata&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fetched URL &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;page_metadata&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;InvalidURL&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unable to fetch invalid URL `&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;`: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ClientError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ClientError while fetching URL `&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;`: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unexpected error while fetching URL `&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;`: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
aiohttp_aiofiles_tutorial/fetcher.py




&lt;p&gt;When fetching a URL via an &lt;strong&gt;aiohttp&lt;/strong&gt; &lt;code&gt;ClientSession&lt;/code&gt;, calling the &lt;code&gt;.text()&lt;/code&gt; method on the response (&lt;code&gt;await resp.text()&lt;/code&gt;) returns the body of the response as a &lt;em&gt;string&lt;/em&gt;. This is not to be confused with &lt;code&gt;.read()&lt;/code&gt;, which returns the body as a &lt;em&gt;bytes&lt;/em&gt; object (useful for pulling media files or anything besides text).&lt;/p&gt;

&lt;p&gt;If you're keeping track, we're now three "contexts" deep:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;We started our script by opening an &lt;code&gt;aiofiles.open()&lt;/code&gt; context, which will remain open until our script is complete. This allows us to write to our &lt;code&gt;outfile&lt;/code&gt; from any task for the duration of our script.&lt;/li&gt;
&lt;li&gt;After writing headers to our CSV file, we opened a persistent client request session with &lt;code&gt;async with ClientSession() as session&lt;/code&gt;, which allows us to make requests en masse as long as the session is open.&lt;/li&gt;
&lt;li&gt;In the snippet above, we've entered a third and final context: the response context for a single URL (via &lt;code&gt;async with session.get(url) as resp&lt;/code&gt;). Unlike the other two contexts, we'll be entering and leaving this context 174 times (once per URL).&lt;/li&gt;
&lt;/ol&gt;
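&lt;p&gt;The nesting is easier to see with the I/O stripped away. The sketch below uses a hypothetical &lt;code&gt;context()&lt;/code&gt; helper, built on the standard library's &lt;code&gt;asynccontextmanager&lt;/code&gt;, that only records when each context is entered and exited. It stands in for the real file, session, and response contexts:&lt;/p&gt;

```python
import asyncio
from contextlib import asynccontextmanager

events = []


@asynccontextmanager
async def context(name: str):
    """Hypothetical context manager that records when it is entered and exited."""
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        events.append(f"exit {name}")


async def fetch(url: str):
    # Innermost context: entered and exited once per URL.
    async with context(f"response {url}"):
        await asyncio.sleep(0)


async def main():
    async with context("outfile"):  # 1. Stays open for the whole run.
        async with context("session"):  # 2. Shared by every request.
            await asyncio.gather(fetch("a"), fetch("b"))  # 3. One per URL.


if __name__ == "__main__":
    asyncio.run(main())
    print("\n".join(events))
```

&lt;p&gt;The two outer contexts are each entered once and exited last, while the response context is entered and exited once per URL.&lt;/p&gt;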

&lt;p&gt;Inside each URL response context is where we finally start producing some output. This leaves us with our final bit of logic, &lt;code&gt;await parse_html_page_metadata(html, url)&lt;/code&gt;, which parses each URL response and returns some scraped metadata from the page. The next line, &lt;code&gt;await outfile.write(f"{page_metadata}\n")&lt;/code&gt;, writes that metadata to our &lt;code&gt;outfile&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  Write Parsed Metadata to CSV
&lt;/h2&gt;

&lt;p&gt;How are we planning to rip metadata out of HTML pages, you ask? With &lt;a href="https://hackersandslackers.com/scraping-urls-with-beautifulsoup/" rel="noopener noreferrer"&gt;BeautifulSoup&lt;/a&gt;, of course! With the HTML of an HTTP response in hand, we use &lt;code&gt;bs4&lt;/code&gt; to parse each URL response and return values for each of the columns in our &lt;code&gt;outfile&lt;/code&gt;: &lt;strong&gt;title&lt;/strong&gt;, &lt;strong&gt;description&lt;/strong&gt;, &lt;strong&gt;primary_tag&lt;/strong&gt;, &lt;strong&gt;url&lt;/strong&gt;, and &lt;strong&gt;published_at&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;These five values are returned as a comma-separated string, then written to our &lt;code&gt;outfile&lt;/code&gt; CSV as a single row.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Parse metadata from raw HTML.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bs4.builder&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ParserRejectedMarkup&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LOGGER&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_html_page_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Extract page metadata from raw HTML into a CSV row.

    :param str html: Raw HTML source of a given fetched URL.
    :param str url: URL associated with the extracted HTML.

    :returns: str
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;html.parser&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta[name=description]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;`&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"'"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;`&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;primary_tag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta[property=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;article:tag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;published_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;meta[property=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;article:published_time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;T&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;primary_tag&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;primary_tag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;primary_tag&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;published_at&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ParserRejectedMarkup&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to parse invalid html for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ValueError occurred when parsing html for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Parsing failed when parsing html for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
aiohttp_aiofiles_tutorial/parser.py



&lt;h2&gt;
  
  
  Run the Jewels, Run the Script
&lt;/h2&gt;

&lt;p&gt;Let's take this bad boy for a spin. I threw a timer into &lt;strong&gt;__init__.py&lt;/strong&gt; to log the number of seconds that elapse for the duration of the script:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Make hundreds of requests concurrently and save responses to disk.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;perf_counter&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;timer&lt;/span&gt;

&lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;init_script&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Prepare output file &amp;amp; task creation/execution.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;timer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# Add timer to function
&lt;/span&gt;    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;aiofiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EXPORT_FILEPATH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title,description,primary_tag,url,published_at&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;execute_fetcher_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;outfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
         &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Executed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
     &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Log time of execution
&lt;/span&gt;
&lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
aiohttp_aiofiles_tutorial/__init__.py




&lt;p&gt;Mash that mfing &lt;code&gt;make run&lt;/code&gt; command if you're following along in the repo (or just punch in &lt;code&gt;python3 main.py&lt;/code&gt;). Strap yourself in:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;...
16:12:34 PM | INFO: Fetched URL 165 of 173: Setting up a MySQL Database on Ubuntu, Setting up MySQL the old-fashioned way: on a linux server, DevOps, https://hackersandslackers.com/set-up-mysql-database/, 2018-04-17 
16:12:34 PM | INFO: Fetched URL 164 of 173: Dropping Rows of Data Using Pandas, Square one of cleaning your Pandas Dataframes: dropping empty or problematic data., Data Analysis, https://hackersandslackers.com/pandas-dataframe-drop/, 2018-04-18 
16:12:34 PM | INFO: Fetched URL 167 of 173: Installing Django CMS on Ubuntu, Get the play-by-play on how to &lt;span class="nb"&gt;install &lt;/span&gt;DjangoCMS: the largest of three major CMS products &lt;span class="k"&gt;for &lt;/span&gt;Python&lt;span class="sb"&gt;`&lt;/span&gt;s Django framework., Software, https://hackersandslackers.com/installing-django-cms/, 2017-11-19 
16:12:34 PM | INFO: Fetched URL 166 of 173: Starting a Python Web App with Flask &amp;amp; Heroku, Pairing Flask with zero-effort container deployments is a deadly path to addiction., Architecture, https://hackersandslackers.com/flask-app-heroku/, 2018-02-13 
16:12:34 PM | INFO: Fetched URL 171 of 173: Another &lt;span class="s1"&gt;'Intro to Data Analysis in Python Using Pandas'&lt;/span&gt; Post, An introduction to Python&lt;span class="sb"&gt;`&lt;/span&gt;s quintessential data analysis library., Data Analysis, https://hackersandslackers.com/intro-python-pandas/, 2017-11-16 
16:12:34 PM | INFO: Fetched URL 172 of 173: Managing Python Environments With Virtualenv, Embrace core best-practices &lt;span class="k"&gt;in &lt;/span&gt;Python by managing your Python packages using virtualenv and virtualenvwrapper., Software, https://hackersandslackers.com/python-virtualenv-virtualenvwrapper/, 2017-11-15 
16:12:34 PM | INFO: Fetched URL 170 of 173: Visualize Folder Structures with Python’s Treelib, Using Python&lt;span class="sb"&gt;`&lt;/span&gt;s treelib library to output the contents of &lt;span class="nb"&gt;local &lt;/span&gt;directories as visual tree representations., Data Engineering, https://hackersandslackers.com/python-tree-hierachies-treelib/, 2017-11-17 
16:12:34 PM | INFO: Fetched URL 169 of 173: Merge Sets of Data &lt;span class="k"&gt;in &lt;/span&gt;Python Using Pandas, Perform SQL-like merges of data using Python&lt;span class="sb"&gt;`&lt;/span&gt;s Pandas., Data Analysis, https://hackersandslackers.com/merge-dataframes-with-pandas/, 2017-11-17 
16:12:34 PM | INFO: Fetched URL 168 of 173: Starting an ExpressJS App, Installation guide &lt;span class="k"&gt;for &lt;/span&gt;ExpressJS with popular customization options., JavaScript, https://hackersandslackers.com/create-an-expressjs-app/, 2017-11-18 
16:12:34 PM | SUCCESS: Executed aiohttp_aiofiles_tutorial &lt;span class="k"&gt;in &lt;/span&gt;2.96 seconds. 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
The tail end of our log after fetching 173 pages in ~3 seconds




&lt;p&gt;The higher end of our script's execution time is 3 seconds. A typical synchronous Python request takes 1-2 seconds to complete, so fetching the same ~170 pages one at a time would take roughly 3 to 6 minutes. That puts our speedup somewhere in the range of &lt;em&gt;60 to 115 times faster&lt;/em&gt; for a sample of this size.&lt;/p&gt;

&lt;p&gt;Writing async scripts in Python surely takes more effort, but not &lt;em&gt;hundreds&lt;/em&gt; or &lt;em&gt;thousands&lt;/em&gt; of times more effort. Even if it isn't speed you're after, handling the &lt;em&gt;volume&lt;/em&gt; of larger-scale applications renders Asyncio absolutely critical. For example, if your chatbot or webserver is in the middle of handling a user's request, what happens when a second user attempts to interact with your app in the meantime? Oftentimes the answer is &lt;em&gt;nothing:&lt;/em&gt; &lt;strong&gt;User 1&lt;/strong&gt; gets what they want, and &lt;strong&gt;User 2&lt;/strong&gt; is stuck talking to a blocked thread.&lt;/p&gt;
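
&lt;p&gt;To make the two-user scenario concrete, here's a minimal sketch. It assumes a hypothetical &lt;code&gt;handle_request()&lt;/code&gt; coroutine (not part of the tutorial repo) where &lt;code&gt;asyncio.sleep()&lt;/code&gt; stands in for slow I/O. Both users get served in roughly the time it takes to serve one:&lt;/p&gt;

```python
import asyncio
import time


async def handle_request(user, delay=0.2):
    """Hypothetical handler; the sleep stands in for slow I/O."""
    await asyncio.sleep(delay)  # Control returns to the event loop here
    return f"Response for {user}"


async def serve_users():
    """Serve two users concurrently and report how long it took."""
    start = time.perf_counter()
    responses = await asyncio.gather(
        handle_request("User 1"),
        handle_request("User 2"),
    )
    return responses, time.perf_counter() - start


if __name__ == "__main__":
    responses, elapsed = asyncio.run(serve_users())
    print(responses, f"{elapsed:0.2f} seconds")
```

&lt;p&gt;Swap &lt;code&gt;asyncio.sleep()&lt;/code&gt; for a blocking &lt;code&gt;time.sleep()&lt;/code&gt; and User 2 waits for User 1 to finish before anything happens.&lt;/p&gt;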

&lt;p&gt;Anyway, seeing is believing. Here's the source code for this tutorial:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hackersandslackers" rel="noopener noreferrer"&gt;
        hackersandslackers
      &lt;/a&gt; / &lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial" rel="noopener noreferrer"&gt;
        aiohttp-aiofiles-tutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
       🔄 🌐  Handle thousands of HTTP requests, disk writes, and other I/O-bound tasks simultaneously with Python's quintessential async libraries.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Asynchronous HTTP Requests Tutorial&lt;/h1&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/b7e4df7b61fb4619b46349921b17562fec1bc245e757d92215f6719c358834e9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d76253545332e392d626c75652e7376673f6c6f676f3d707974686f6e266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d356538316163267374796c653d666c61742d73717561726526636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/b7e4df7b61fb4619b46349921b17562fec1bc245e757d92215f6719c358834e9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d76253545332e392d626c75652e7376673f6c6f676f3d707974686f6e266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d356538316163267374796c653d666c61742d73717561726526636f6c6f72413d346335363661" alt="Python"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/014c6980bbb3ab25a6f927c1aaf18a0309fd4305ce0971a613ccd72e9dcfd516/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4173796e63696f2d76253545332e342e332d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/014c6980bbb3ab25a6f927c1aaf18a0309fd4305ce0971a613ccd72e9dcfd516/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4173796e63696f2d76253545332e342e332d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661" alt="Asyncio"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/dec6f8073f8423617d1eae4e7afac75c8ea28bd5c82e3bb98a3dfedae64c55a5/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f41696f687474702d76253545332e382e312d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/dec6f8073f8423617d1eae4e7afac75c8ea28bd5c82e3bb98a3dfedae64c55a5/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f41696f687474702d76253545332e382e312d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661" alt="Aiohttp"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/2cbb3dd1ca0701c618ce4d1d9a81e96f55d805d34f282106439a06b8c94a8908/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f41696f66696c65732d76253545302e382e302d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/2cbb3dd1ca0701c618ce4d1d9a81e96f55d805d34f282106439a06b8c94a8908/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f41696f66696c65732d76253545302e382e302d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661" alt="Aiofiles"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/d097967f88f18cb4e4f6ae12bf76b62102e75c001da7f829020069bda0138dd9/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562"&gt;&lt;img src="https://camo.githubusercontent.com/d097967f88f18cb4e4f6ae12bf76b62102e75c001da7f829020069bda0138dd9/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562" alt="GitHub Last Commit"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/issues" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/36c2452c9c9667ce06dbabd1bd682515ea5e9576843dd2e9cc17ff8f27689bd4/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6973737565732f6861636b657273616e64736c61636b6572732f61696f687474702d61696f66696c65732d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Issues"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/stargazers" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/5715aededee1ee47b1b03647f209119188cfb3e768b7aef57beb1e1d8a821493/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f6861636b657273616e64736c61636b6572732f61696f687474702d61696f66696c65732d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Stars"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial/network" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/2fd29c70b427de0066605ebff826b7222a96658b8fa6d878848be9c7d7165604/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f666f726b732f6861636b657273616e64736c61636b6572732f61696f687474702d61696f66696c65732d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Forks"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial.github/aiohttp@2x.jpg"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fhackersandslackers%2Faiohttp-aiofiles-tutorial.github%2Faiohttp%402x.jpg" alt="Asyncio"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Make asynchronous HTTP requests and write to disk using &lt;a href="https://docs.python.org/3/library/asyncio.html" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;asyncio&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://docs.aiohttp.org/en/stable/" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;aiohttp&lt;/strong&gt;&lt;/a&gt;, &amp;amp; &lt;a href="https://github.com/Tinche/aiofiles" rel="noopener noreferrer"&gt;&lt;strong&gt;aiofiles&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Source code demonstrating asynchronous Python for the corresponding Hackersandslackers post: &lt;a href="https://hackersandslackers.com/async-requests-aiohttp-aiofiles/" rel="nofollow noopener noreferrer"&gt;https://hackersandslackers.com/async-requests-aiohttp-aiofiles/&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Getting Started&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;Get up and running by cloning this repository and running &lt;code&gt;make deploy&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;$ git clone https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; aiohttp-aiofiles-tutorial
$ make deploy&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Hackers and Slackers&lt;/strong&gt; tutorials are free of charge. If you found this tutorial helpful, a &lt;a href="https://www.buymeacoffee.com/hackersslackers" rel="nofollow noopener noreferrer"&gt;small donation&lt;/a&gt; would be greatly appreciated to keep us in business. All proceeds go towards coffee, and all coffee goes towards more content.&lt;/p&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/hackersandslackers/aiohttp-aiofiles-tutorial" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;Source code for this tutorial




</description>
      <category>concurrency</category>
      <category>python</category>
      <category>scraping</category>
      <category>automation</category>
    </item>
    <item>
      <title>Intro to Asynchronous Python with Asyncio</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Mon, 10 Jan 2022 14:22:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/intro-to-asynchronous-python-with-asyncio-10c9</link>
      <guid>https://dev.to/hackersandslackers/intro-to-asynchronous-python-with-asyncio-10c9</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hackersandslackers.com%2F2021%2F09%2Faiohttp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hackersandslackers.com%2F2021%2F09%2Faiohttp.jpg" alt="Intro to Asynchronous Python with Asyncio"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's fair to say that most &lt;strong&gt;Hackers and Slackers&lt;/strong&gt; readers share one thing in common: we like writing stuff in Python. This does not make us unique; it's reflective of a well-known and easily explainable phenomenon as Data Scientists/Engineers enter (and more recently, leave) a space previously reserved for software engineering: multi-purpose programming languages. Despite how different these disciplines are from one another, we share a common trait. To quote the Beatles, &lt;em&gt;All we Need is Python™️&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;And yet, each journey has a defining moment where it's clear the language has surpassed its &lt;strong&gt;30th birthday&lt;/strong&gt;. It's been three decades since Guido unleashed The Serpent to tackle the problems of &lt;strong&gt;1991.&lt;/strong&gt; This was the final year of the Cold War era: a different time in history, and even more so in computing. Most of the design decisions behind Python were sensible for their time, but a few of these decisions have become today's "quirks." The most controversial quirk is easily the topic of &lt;strong&gt;concurrency&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Indeed, I'm referring to the &lt;a href="https://wiki.python.org/moin/GlobalInterpreterLock" rel="noopener noreferrer"&gt;Global Interpreter Lock&lt;/a&gt; (GIL). I'll save you the pain of disparaging the GIL as others have done a much better job of doing this than I can (if you're interested in the nitty-gritty and have the time, I highly recommend a piece entitled &lt;a href="https://tenthousandmeters.com/blog/python-behind-the-scenes-13-the-gil-and-its-effects-on-python-multithreading/" rel="noopener noreferrer"&gt;The GIL and its effects on Python multithreading&lt;/a&gt;). The long-and-short of the GIL is that it restricts Python from utilizing multiple CPU cores effectively, leaving your 8-core laptop running Python scripts on a single core while the others sit idle.&lt;/p&gt;

&lt;p&gt;Concurrency is a complicated problem extending far beyond Python; most programming languages share a similar fate. But we're not here to lament our circumstances; we're here to talk about asynchronous I/O.&lt;/p&gt;

&lt;h2&gt;
  
  
  Doing Multiple Things at Once in Python
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Concurrency&lt;/strong&gt; is a broad concept in programming that boils down to "doing a bunch of stuff at once." Code that is said to run concurrently generally takes one of two possible forms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Tasks taking turns to be executed to minimize downtime of peers' tasks.&lt;/li&gt;
&lt;li&gt;Tasks that truly run in parallel to one another, simultaneously.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Python ships with a module for each approach, respectively:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.python.org/3/library/threading.html" rel="noopener noreferrer"&gt;&lt;strong&gt;Threading&lt;/strong&gt;&lt;/a&gt; (single process): Python's threading module is limited to utilizing a single processor at any given time. The &lt;strong&gt;threading&lt;/strong&gt; module governs which task occupies that processor: Task X runs until it is blocked by an external factor (such as awaiting a response to an HTTP request). In the meantime, Task Y is given priority to execute until Task X is ready to continue (hence "blocking I/O").&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.python.org/3/library/multiprocessing.html" rel="noopener noreferrer"&gt;&lt;strong&gt;Multiprocessing&lt;/strong&gt;&lt;/a&gt; (multiple processors): The &lt;strong&gt;multiprocessing&lt;/strong&gt; module enables code to run &lt;em&gt;in parallel&lt;/em&gt;. A script is initialized and run &lt;em&gt;n&lt;/em&gt; times simultaneously across &lt;em&gt;n&lt;/em&gt; CPUs. Expanding a single road into an 8-lane highway has obvious performance benefits until it comes time to consolidate the results of each task. Multiple processes cannot simultaneously write data to the same destination (databases, files, etc.) without creating locks that block one another. For scripts intended to produce an output, wrestling with this problem is almost certainly an insurmountable endeavor.&lt;/li&gt;
&lt;/ul&gt;
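
&lt;p&gt;As a rough illustration of the threading approach, here's a minimal sketch. The names &lt;code&gt;blocking_task()&lt;/code&gt; and &lt;code&gt;run_in_threads()&lt;/code&gt; are made up for this example, and &lt;code&gt;time.sleep()&lt;/code&gt; stands in for a blocking HTTP request. While Task X waits, Task Y gets its turn, so both finish in about the time of one:&lt;/p&gt;

```python
import threading
import time


def blocking_task(name, results, delay=0.2):
    """Simulate blocking I/O; other threads run while this one sleeps."""
    time.sleep(delay)
    results.append(name)  # list.append is safe to call from threads in CPython


def run_in_threads(names):
    """Start one thread per task and wait for all of them to finish."""
    results = []
    threads = [
        threading.Thread(target=blocking_task, args=(name, results))
        for name in names
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    finished, elapsed = run_in_threads(["Task X", "Task Y"])
    print(sorted(finished), f"{elapsed:0.2f} seconds")
```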

&lt;h2&gt;
  
  
  Where Does Asyncio Fit In?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://docs.python.org/3/library/asyncio.html" rel="noopener noreferrer"&gt;Asyncio&lt;/a&gt; is a third and generally preferred alternative to the approaches above. Despite being limited to a single thread, &lt;strong&gt;Asyncio&lt;/strong&gt; can execute a high volume of operations much faster than Python's native single-threaded execution. To illustrate how this is possible, consider how human beings tend to "multitask." When people claim to be "multitasking," they generally get things done by &lt;em&gt;juggling&lt;/em&gt; between tasks instead of &lt;em&gt;doing multiple things simultaneously&lt;/em&gt;. Single-threaded programs are asynchronous in the same way: output is optimized by &lt;em&gt;overlapping&lt;/em&gt; work.&lt;/p&gt;

&lt;p&gt;While humans are notoriously awful at multitasking, machines can see significant performance benefits from this practice, as they are generally better suited to start and stop work without the extraneous overhead. This concept is the "secret sauce" of frameworks such as &lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;, an asynchronous Python framework that self-proclaims performance to be &lt;em&gt;"on par with&lt;/em&gt; &lt;strong&gt;&lt;em&gt;NodeJS&lt;/em&gt;&lt;/strong&gt; &lt;em&gt;and&lt;/em&gt; &lt;strong&gt;&lt;em&gt;Go"&lt;/em&gt;&lt;/strong&gt; (I promise to write a fair share on FastAPI another time).&lt;/p&gt;

&lt;p&gt;It took me a while to break my skepticism of how a single thread handling multiple tasks kind-of-at-the-same-time could deliver a performance benefit worth writing about. It wasn't until I stumbled upon the concept of &lt;strong&gt;event loops&lt;/strong&gt; that things started making sense.&lt;/p&gt;

&lt;h2&gt;
  
  
  Event Loops
&lt;/h2&gt;

&lt;p&gt;We start with a backlog of I/O "tasks." They can be HTTP requests, saving things to disk, or in our case, a mix of both of those. Synchronous Python workflows (read: standard Python) would execute each task one at a time, from start to finish. It's like waiting in line at the DMV, where there's only one line, and the lady hates all of you.&lt;/p&gt;

&lt;p&gt;An event loop handles things differently to get through tasks quicker. Given many tasks, an event "loop" works by grabbing new tasks and running them on its thread. The loop continuously checks the tasks in progress for downtime (or completion). When a task is "waiting" on an external factor (such as an HTTP request), the event loop fills the dead time by kicking off another task in the thread. If the event loop finds that a task has been completed, the task is removed from the thread, its output is collected, and the loop picks another task from the queue to occupy the thread:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hackersandslackers.com%2F2021%2F04%2Fasync_eventloop.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hackersandslackers.com%2F2021%2F04%2Fasync_eventloop.jpg" alt="Intro to Asynchronous Python with Asyncio"&gt;&lt;/a&gt;&lt;/p&gt;
Asynchronous I/O Event Loop



&lt;p&gt;Unlike a DMV line, an event loop works somewhat similarly to a restaurant (stick with me here). Despite having many tables and a single kitchen, a server handles their "backlog" by rotating between tables (tasks) and a kitchen (thread). It's much more efficient to take multiple food orders in succession to be prepared in a kitchen than to take single orders and wait for them to be created/served before moving to the next customer.&lt;/p&gt;
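
&lt;p&gt;Here's the restaurant analogy as a small sketch. The &lt;code&gt;table()&lt;/code&gt; and &lt;code&gt;dinner_service()&lt;/code&gt; coroutines are invented for illustration, and &lt;code&gt;asyncio.sleep()&lt;/code&gt; plays the part of the kitchen. The recorded events show every order being taken before any food comes out:&lt;/p&gt;

```python
import asyncio


async def table(number, events, wait=0.05):
    """One table: place an order, wait on the kitchen, get served."""
    events.append(f"order taken: table {number}")
    await asyncio.sleep(wait)  # Kitchen is cooking; the server moves on
    events.append(f"food served: table {number}")


async def dinner_service():
    """Rotate between three tables on a single thread."""
    events = []
    await asyncio.gather(*(table(n, events) for n in range(1, 4)))
    return events


if __name__ == "__main__":
    for event in asyncio.run(dinner_service()):
        print(event)
```

&lt;p&gt;A synchronous version would alternate "order taken" and "food served" for each table in turn, with the server standing idle during every wait.&lt;/p&gt;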

&lt;h2&gt;
  
  
  Coroutines: Functions to Run Asynchronously
&lt;/h2&gt;

&lt;p&gt;Asynchronous Python scripts don't define &lt;em&gt;functions&lt;/em&gt; - they define &lt;strong&gt;coroutines&lt;/strong&gt;. Coroutines (defined with &lt;code&gt;async def&lt;/code&gt;, as opposed to &lt;code&gt;def&lt;/code&gt;) can halt execution before completion, typically waiting on the completion of another coroutine. The snippet below demonstrates the simplest example of a coroutine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Define a Coroutine function to be executed asynchronously.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LOGGER&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Wait for a time delay &amp;amp; display number associated with coroutine.

    :param int number: Number to identify the current coroutine.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Coroutine &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;number&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; has finished executing.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
coroutines.py




&lt;p&gt;&lt;code&gt;simple_coroutine()&lt;/code&gt; halts its execution for 1 second before logging a message. Coroutines can't be invoked like regular functions: calling &lt;code&gt;simple_coroutine(1)&lt;/code&gt; on its own only returns a coroutine object, and nothing executes until that object is run inside an Asyncio event loop. Luckily, creating an event loop is easy:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;coroutines&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;simple_coroutine&lt;/span&gt; &lt;span class="c1"&gt;# Import our coroutine
&lt;/span&gt;
&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Running a Coroutine
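
&lt;p&gt;If you want to see the difference between &lt;em&gt;calling&lt;/em&gt; a coroutine and &lt;em&gt;running&lt;/em&gt; one, here's a quick sketch using a throwaway &lt;code&gt;add_one()&lt;/code&gt; coroutine (hypothetical, not from the tutorial):&lt;/p&gt;

```python
import asyncio
import inspect


async def add_one(number):
    """Throwaway coroutine: yield to the event loop once, then return."""
    await asyncio.sleep(0)
    return number + 1


if __name__ == "__main__":
    coro = add_one(1)  # Nothing has executed yet
    print(inspect.iscoroutine(coro))  # True: we only hold a coroutine object
    print(asyncio.run(coro))  # 2: the event loop actually runs it
```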




&lt;p&gt;&lt;code&gt;asyncio.run()&lt;/code&gt; creates an event loop and runs the coroutine passed into it. This method of creating an event loop is best when your script has an entry point from which all logic originates. To execute a handful of coroutines concurrently, &lt;code&gt;asyncio.gather()&lt;/code&gt; accepts any number of coroutines; it doesn't create an event loop of its own, so it must be awaited from a coroutine that is already running in one:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;coroutines&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;simple_coroutine&lt;/span&gt; &lt;span class="c1"&gt;# Import our coroutine
&lt;/span&gt;

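&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Run three coroutines concurrently inside one event loop.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;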
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Running 3 coroutines inside an event loop




&lt;p&gt;Running this script will execute all three coroutines and log the following:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;1
2
3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Output of `asyncio.gather()` with three coroutines




&lt;p&gt;How long would you suppose the above operation takes to complete? 3 seconds, perhaps? Or have we managed to optimize our code with sorcery?&lt;/p&gt;

&lt;p&gt;You might be surprised to learn that running the above consistently executes in almost exactly &lt;em&gt;one second&lt;/em&gt; (or occasionally &lt;strong&gt;1.01&lt;/strong&gt; on a bad day). If we time our function's execution time using Python's built-in &lt;a href="https://docs.python.org/3/library/time.html#time.perf_counter" rel="noopener noreferrer"&gt;time.perf_counter()&lt;/a&gt;, we can see this first-hand:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;coroutines&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;simple_coroutine&lt;/span&gt; &lt;span class="c1"&gt;# Import our coroutine
&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;async_example&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c1"&gt;# Await all three coroutines concurrently before stopping the clock&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Executed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;async_example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;async_example&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Track execution time of executing 3 coroutines which sleep for 1 second




&lt;p&gt;Sure enough, the script takes almost &lt;em&gt;exactly&lt;/em&gt; 1 second:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Executed async_example &lt;span class="k"&gt;in &lt;/span&gt;1.01 seconds.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Output of `async_example()`




&lt;p&gt;Our coroutine &lt;code&gt;simple_coroutine()&lt;/code&gt; takes 1 second to execute on its own. What's impressive about the above is that we called this coroutine three times and got a runtime of nearly 1 second, whereas a &lt;em&gt;synchronous&lt;/em&gt; Python script would've taken 3 seconds. What's more, the overhead for executing these tasks was no more than about &lt;code&gt;.01&lt;/code&gt; seconds, meaning all three coroutines completed at almost the same time.&lt;/p&gt;
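&lt;p&gt;To make the comparison concrete, here is a minimal sketch that times the same three calls awaited one after another versus handed to &lt;code&gt;asyncio.gather()&lt;/code&gt;. The &lt;code&gt;sleepy()&lt;/code&gt; coroutine is a hypothetical stand-in for &lt;code&gt;simple_coroutine()&lt;/code&gt; (it sleeps for 0.2 seconds rather than 1 to keep the demo short), so treat the numbers as illustrative only:&lt;/p&gt;

```python
import asyncio
import time


async def sleepy(number):
    """Hypothetical stand-in for simple_coroutine(): sleep briefly, return the number."""
    await asyncio.sleep(0.2)
    return number


async def run_sequentially():
    """Await each coroutine one after another."""
    start = time.perf_counter()
    results = [await sleepy(n) for n in (1, 2, 3)]
    return results, time.perf_counter() - start


async def run_concurrently():
    """Hand all three coroutines to the event loop at once."""
    start = time.perf_counter()
    results = await asyncio.gather(sleepy(1), sleepy(2), sleepy(3))
    return results, time.perf_counter() - start


sequential_results, sequential_time = asyncio.run(run_sequentially())
concurrent_results, concurrent_time = asyncio.run(run_concurrently())
print(f"Sequential: {sequential_time:0.2f}s, concurrent: {concurrent_time:0.2f}s")
```

&lt;p&gt;The sequential version pays for every sleep in turn, while the gathered version should take roughly as long as a single sleep, since the coroutines all wait at the same time.&lt;/p&gt;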
&lt;h2&gt;
  
  
  Working with Tasks
&lt;/h2&gt;

&lt;p&gt;Using &lt;code&gt;asyncio.gather()&lt;/code&gt; in the above example, we side-stepped an essential data structure in Asyncio: the &lt;code&gt;Task&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Coroutines are functions that can run asynchronously. When running hundreds or thousands of them, it helps to be able to "manage" them: knowing when a coroutine fails (and what to do about it), or simply checking in on &lt;em&gt;which&lt;/em&gt; coroutine a loop is currently handling. This matters most when our event loop might take minutes or hours to execute, or has the potential to fail.&lt;/p&gt;
&lt;h3&gt;
  
  
  Managing Tasks
&lt;/h3&gt;

&lt;p&gt;In more complex workflows, Tasks offer a number of useful methods to help us manage Tasks being executed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.set_name([name])&lt;/code&gt; (and &lt;code&gt;.get_name()&lt;/code&gt;): Gives the task a name for the purpose of having a human-readable way to identify which task is which.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.cancel(msg=[message])&lt;/code&gt;: Cancels a task in an event loop, allowing the loop to continue with other tasks. Useful for tasks that become unresponsive or are unlikely to complete.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.cancelled()&lt;/code&gt;: Returns &lt;code&gt;True&lt;/code&gt; if the task was canceled, or &lt;code&gt;False&lt;/code&gt; otherwise.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.done()&lt;/code&gt;: Returns &lt;code&gt;True&lt;/code&gt; once the task has finished, whether it returned a value, raised an exception, or was canceled; returns &lt;code&gt;False&lt;/code&gt; otherwise.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.result()&lt;/code&gt;: Returns the value returned by the task's coroutine. Calling it on a canceled task raises &lt;code&gt;CancelledError&lt;/code&gt;, and if the coroutine raised an exception, that exception is re-raised. A task that has not finished yet raises an &lt;code&gt;InvalidStateError&lt;/code&gt; exception.&lt;/li&gt;
&lt;li&gt;A bunch of other methods, all found in &lt;a href="https://docs.python.org/3/library/asyncio-task.html#task-object" rel="noopener noreferrer"&gt;Asyncio's Task documentation&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
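&lt;p&gt;As a rough illustration of the methods listed above, the sketch below names two tasks, lets one finish, and cancels the other. The &lt;code&gt;nap()&lt;/code&gt; coroutine and the task names are made up for this example; note that the standard library spells the method &lt;code&gt;cancelled()&lt;/code&gt;, and that the &lt;code&gt;name&lt;/code&gt; argument and &lt;code&gt;set_name()&lt;/code&gt; assume Python 3.8 or newer:&lt;/p&gt;

```python
import asyncio


async def nap(seconds):
    """Hypothetical coroutine: sleep, then return a short message."""
    await asyncio.sleep(seconds)
    return f"napped {seconds}s"


async def inspect_tasks():
    quick = asyncio.create_task(nap(0.1), name="quick")
    slow = asyncio.create_task(nap(30))
    slow.set_name("slow")  # Names can also be assigned after creation

    await quick    # Let the quick task run to completion
    slow.cancel()  # Request cancellation of the slow task
    try:
        await slow  # Awaiting a cancelled task raises CancelledError
    except asyncio.CancelledError:
        pass

    # Map each task name to its (done, cancelled) flags
    states = {
        task.get_name(): (task.done(), task.cancelled())
        for task in (quick, slow)
    }
    return states, quick.result()


states, quick_result = asyncio.run(inspect_tasks())
print(states, quick_result)
```

&lt;p&gt;Both tasks report &lt;code&gt;done()&lt;/code&gt; as true at the end, since a canceled task counts as finished; only &lt;code&gt;cancelled()&lt;/code&gt; tells the two outcomes apart.&lt;/p&gt;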
&lt;h3&gt;
  
  
  Creating Tasks
&lt;/h3&gt;

&lt;p&gt;Wrapping &lt;code&gt;Coroutine&lt;/code&gt;s with Asyncio &lt;code&gt;Task&lt;/code&gt; is simple. Running &lt;code&gt;asyncio.gather()&lt;/code&gt; earlier handled this for us, but this is simply a shortcut that deprives us of utilizing the upsides of Tasks, as the tasks are instantiated as generic objects and executed immediately. If we create our tasks &lt;em&gt;beforehand,&lt;/em&gt; we can associate metadata to them and execute them in an event loop when we're ready.&lt;/p&gt;

&lt;p&gt;We're going to create a new coroutine called &lt;code&gt;create_tasks()&lt;/code&gt;, which:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creates &lt;em&gt;n&lt;/em&gt; Task instances of &lt;code&gt;simple_coroutine()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Assigns each Task a name upon creation.&lt;/li&gt;
&lt;li&gt;Returns all Tasks as a Python list, which can later be executed via an event loop:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create multiple tasks from a Coroutine.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LOGGER&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;asyncio_intro_part1.coroutines&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;simple_coroutine&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tasks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Create n number of asyncio tasks to be executed.

    :param int num_tasks: Number of tasks to create.

    :returns: List[Task]
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;task_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Creating &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_tasks&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tasks to be executed...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_tasks&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;simple_coroutine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task #&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;task_list&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Created Task: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;task_list&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
tasks.py



&lt;h3&gt;
  
  
  Tasks in Action
&lt;/h3&gt;

&lt;p&gt;With our &lt;code&gt;create_tasks()&lt;/code&gt; method defined, it's time for the fun part: seeing tasks being created, executed, and completed. At the root of our project, we'll define one last function, &lt;code&gt;async_tasks_example()&lt;/code&gt;, to demonstrate this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.tasks&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_tasks&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;async_tasks_example&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create and inspect tasks to wrap simple functions.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;task_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;create_tasks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tasks completed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_name&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;LOGGER&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tasks pending: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_name&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pending&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
__init__.py




&lt;p&gt;We first assign 5 tasks created via &lt;code&gt;create_tasks()&lt;/code&gt; to the &lt;code&gt;task_list&lt;/code&gt; variable. As this occurs, we see the proper logging we added in &lt;strong&gt;tasks.py&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;17:00:53 PM | INFO: Creating 5 tasks to be executed... 
17:00:53 PM | INFO: Created Task: &amp;lt;Task pending &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Task #0'&lt;/span&gt; &lt;span class="nv"&gt;coro&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;simple_coroutine&lt;span class="o"&gt;()&lt;/span&gt; running at /Users/toddbirchard/Projects/asyncio-tutorial-part1/asyncio_intro_part1/coroutines.py:7&amp;gt;&amp;gt; 
17:00:53 PM | INFO: Created Task: &amp;lt;Task pending &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Task #1'&lt;/span&gt; &lt;span class="nv"&gt;coro&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;simple_coroutine&lt;span class="o"&gt;()&lt;/span&gt; running at /Users/toddbirchard/Projects/asyncio-tutorial-part1/asyncio_intro_part1/coroutines.py:7&amp;gt;&amp;gt; 
17:00:53 PM | INFO: Created Task: &amp;lt;Task pending &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Task #2'&lt;/span&gt; &lt;span class="nv"&gt;coro&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;simple_coroutine&lt;span class="o"&gt;()&lt;/span&gt; running at /Users/toddbirchard/Projects/asyncio-tutorial-part1/asyncio_intro_part1/coroutines.py:7&amp;gt;&amp;gt; 
17:00:53 PM | INFO: Created Task: &amp;lt;Task pending &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Task #3'&lt;/span&gt; &lt;span class="nv"&gt;coro&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;simple_coroutine&lt;span class="o"&gt;()&lt;/span&gt; running at /Users/toddbirchard/Projects/asyncio-tutorial-part1/asyncio_intro_part1/coroutines.py:7&amp;gt;&amp;gt; 
17:00:53 PM | INFO: Created Task: &amp;lt;Task pending &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Task #4'&lt;/span&gt; &lt;span class="nv"&gt;coro&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;simple_coroutine&lt;span class="o"&gt;()&lt;/span&gt; running at /Users/toddbirchard/Projects/asyncio-tutorial-part1/asyncio_intro_part1/coroutines.py:7&amp;gt;&amp;gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
The output of creating 5 tasks in tasks.py




&lt;p&gt;We subsequently execute these five tasks via &lt;code&gt;asyncio.wait(task_list)&lt;/code&gt;. &lt;a href="https://docs.python.org/3/library/asyncio-task.html#waiting-primitives" rel="noopener noreferrer"&gt;asyncio.wait()&lt;/a&gt; waits for the tasks in &lt;code&gt;task_list&lt;/code&gt; to finish (by default, until all of them have completed) and returns a tuple of "done" and "pending" tasks. Thanks to some added logging and the presence of task names, we can confirm all tasks completed successfully:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;17:00:54 PM | INFO: Coroutine 0 has finished executing. 
17:00:54 PM | INFO: Coroutine 1 has finished executing. 
17:00:54 PM | INFO: Coroutine 2 has finished executing. 
17:00:54 PM | INFO: Coroutine 3 has finished executing. 
17:00:54 PM | INFO: Coroutine 4 has finished executing. 
17:00:54 PM | SUCCESS: 5 tasks completed: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'Task #1'&lt;/span&gt;, &lt;span class="s1"&gt;'Task #4'&lt;/span&gt;, &lt;span class="s1"&gt;'Task #3'&lt;/span&gt;, &lt;span class="s1"&gt;'Task #0'&lt;/span&gt;, &lt;span class="s1"&gt;'Task #2'&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Executing tasks with &lt;code&gt;asyncio.wait()&lt;/code&gt;&lt;/p&gt;
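&lt;p&gt;In the run above, every task finished, so the "pending" set came back empty. As a hedged sketch of when it would not be, the example below passes a &lt;code&gt;timeout&lt;/code&gt; to &lt;code&gt;asyncio.wait()&lt;/code&gt; so that one deliberately slow task is still pending when the wait returns. The &lt;code&gt;work()&lt;/code&gt; coroutine, the task names, and the durations are assumptions made up for illustration:&lt;/p&gt;

```python
import asyncio


async def work(seconds):
    """Hypothetical coroutine that sleeps for the given number of seconds."""
    await asyncio.sleep(seconds)
    return seconds


async def wait_with_timeout():
    tasks = [
        asyncio.create_task(work(0.1), name="fast"),
        asyncio.create_task(work(30), name="slow"),
    ]
    # Stop waiting after half a second; unfinished tasks land in `pending`
    done, pending = await asyncio.wait(tasks, timeout=0.5)
    done_names = sorted(task.get_name() for task in done)
    pending_names = sorted(task.get_name() for task in pending)
    for task in pending:
        task.cancel()  # wait() does not cancel tasks on timeout, so we do it
    # Let the cancellations settle before the event loop closes
    await asyncio.gather(*pending, return_exceptions=True)
    return done_names, pending_names


done_names, pending_names = asyncio.run(wait_with_timeout())
print(done_names, pending_names)
```

&lt;p&gt;Because &lt;code&gt;asyncio.wait()&lt;/code&gt; leaves timed-out tasks running, it is up to the caller to decide whether to cancel them, as done here, or keep waiting on them later.&lt;/p&gt;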
&lt;h2&gt;
  
  
  Did You Get All That?
&lt;/h2&gt;

&lt;p&gt;I pray that this tutorial has been somewhat helpful as a starting point for Asyncio. Part of the reason I haven't posted new content in a while is my inability to break these concepts down into digestible pieces suitable for human consumption.&lt;/p&gt;

&lt;p&gt;If you're stuck scratching your head, there's some good news. While I originally intended Asynchronous Python to be a single post, it quickly became evident that Asyncio is a long journey. Hooray for multi-part series of dry, technical writing!&lt;/p&gt;

&lt;p&gt;I'm aware I blew through the source code on this one, which is why I've uploaded a fully working version of this tutorial to GitHub below. It's yours to dig around in; here's to hoping the source code clears up any blind spots I might've missed:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev.to%2Fassets%2Fgithub-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hackersandslackers" rel="noopener noreferrer"&gt;
        hackersandslackers
      &lt;/a&gt; / &lt;a href="https://github.com/hackersandslackers/asyncio-tutorial-part1" rel="noopener noreferrer"&gt;
        asyncio-tutorial-part1
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      🐍🔁 Intro to concurrency in Python with Asyncio.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Intro to Asynchronous Python with Asyncio&lt;/h1&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/b7e4df7b61fb4619b46349921b17562fec1bc245e757d92215f6719c358834e9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d76253545332e392d626c75652e7376673f6c6f676f3d707974686f6e266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d356538316163267374796c653d666c61742d73717561726526636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/b7e4df7b61fb4619b46349921b17562fec1bc245e757d92215f6719c358834e9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d76253545332e392d626c75652e7376673f6c6f676f3d707974686f6e266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d356538316163267374796c653d666c61742d73717561726526636f6c6f72413d346335363661" alt="Python"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/014c6980bbb3ab25a6f927c1aaf18a0309fd4305ce0971a613ccd72e9dcfd516/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4173796e63696f2d76253545332e342e332d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/014c6980bbb3ab25a6f927c1aaf18a0309fd4305ce0971a613ccd72e9dcfd516/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4173796e63696f2d76253545332e342e332d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d707974686f6e267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661" alt="Asyncio"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer nofollow" href="https://camo.githubusercontent.com/d097967f88f18cb4e4f6ae12bf76b62102e75c001da7f829020069bda0138dd9/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562"&gt;&lt;img src="https://camo.githubusercontent.com/d097967f88f18cb4e4f6ae12bf76b62102e75c001da7f829020069bda0138dd9/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562" alt="GitHub Last Commit"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/flask-blueprint-tutorial/issues" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/74057cfabb56768f40e28f785487976d135aaa3ea55e7c246ab18d8cd2cc44ab/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6973737565732f6861636b657273616e64736c61636b6572732f6173796e63696f2d7475746f7269616c2d70617274312e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Issues"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/flask-blueprint-tutorial/stargazers" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/a2ffd8e3c029181bc9ce9b8e429946cf96fbb7bf2ec7b0a2dbf0dad5e048afed/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f6861636b657273616e64736c61636b6572732f6173796e63696f2d7475746f7269616c2d70617274312e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Stars"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/flask-blueprint-tutorial/network" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/9af5e6a39ec952399805c803f36e8bf46fe938500df71189226768c2197eedac/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f666f726b732f6861636b657273616e64736c61636b6572732f6173796e63696f2d7475746f7269616c2d70617274312e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Forks"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://github.com/hackersandslackers/asyncio-tutorial-part1./.github/asyncio_intro@2x.jpg"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgithub.com%2Fhackersandslackers%2Fasyncio-tutorial-part1.%2F.github%2Fasyncio_intro%402x.jpg" alt="Asyncio"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Source code demonstrating asynchronous Python for the Hackersandslackers post: &lt;a href="https://hackersandslackers.com/intro-to-asyncio-concurrency/" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;Introduction to asynchronous Python with Asyncio&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Getting Started&lt;/h2&gt;

&lt;/div&gt;
&lt;p&gt;Get up and running by cloning this repository and running &lt;code&gt;make deploy&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;$ git clone https://github.com/hackersandslackers/asyncio-tutorial-part1.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; asyncio-tutorial-part1
$ make deploy&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Hackers and Slackers&lt;/strong&gt; tutorials are free of charge. If you found this tutorial helpful, a &lt;a href="https://www.buymeacoffee.com/hackersslackers" rel="nofollow noopener noreferrer"&gt;small donation&lt;/a&gt; would be greatly appreciated to keep us in business. All proceeds go towards coffee, and all coffee goes towards more content.&lt;/p&gt;
&lt;/div&gt;



&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/hackersandslackers/asyncio-tutorial-part1" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/div&gt;
&lt;br&gt;


</description>
      <category>python</category>
      <category>software</category>
      <category>asyncio</category>
      <category>concurrency</category>
    </item>
    <item>
      <title>Create Cloud-hosted Charts with Plotly Chart Studio</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Thu, 03 Sep 2020 11:28:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/create-cloud-hosted-charts-with-plotly-chart-studio-35l4</link>
      <guid>https://dev.to/hackersandslackers/create-cloud-hosted-charts-with-plotly-chart-studio-35l4</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JO4gXE6O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/09/plotly-chartstudio%402x.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JO4gXE6O--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/09/plotly-chartstudio%402x.jpg" alt="Create Cloud-hosted Charts with Plotly Chart Studio"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Given the success of &lt;a href="https://plotly.com/dash/"&gt;Plotly Dash&lt;/a&gt; and &lt;a href="https://plotly.com/python/plotly-express/"&gt;Plotly Express&lt;/a&gt;, it's easy to forget that Plotly's rise to success began with a product that was neither of these household names. Dash has cornered the &lt;em&gt;interactive dashboard&lt;/em&gt; market, while Plotly Express has become the de facto Python library for generating &lt;em&gt;inline charts&lt;/em&gt;, particularly for Jupyter notebooks. What we &lt;em&gt;don't&lt;/em&gt; get from either of these tools is a way to create data visualizations for &lt;em&gt;any other scenario,&lt;/em&gt; such as creating a single chart to serve publicly as an image or interactive plot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://plotly.com/chart-studio/"&gt;Plotly Chart Studio&lt;/a&gt; shines as a tool to instantly create cloud-hosted data visualizations. The syntax will appear familiar to those who have used Plotly Express (or any other Python data vis library), but Plotly Chart Studio stands alone in &lt;em&gt;how&lt;/em&gt; these charts are created: on a publicly accessible cloud. With a Chart Studio account, each chart you plot is saved to a public Chart Studio profile, like mine:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vLgQR0jJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/plotlychartstudio-profile.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vLgQR0jJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/plotlychartstudio-profile.png" alt="Create Cloud-hosted Charts with Plotly Chart Studio"&gt;&lt;/a&gt;&lt;a href="https://chart-studio.plotly.com/~toddbirchard#/"&gt;&lt;/a&gt;&lt;a href="https://chart-studio.plotly.com/%7Etoddbirchard#/"&gt;https://chart-studio.plotly.com/~toddbirchard#/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each chart in my profile is publicly accessible as your choice of an embedded plot, an image, raw HTML, etc. Here's an example chart as an iFrame (sorry dev.to fam, but I don't think iFrames are supported here):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://plotly.com/%7Etoddbirchard/403/"&gt;https://plotly.com/~toddbirchard/403/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every chart created with Plotly Chart Studio automatically generates interactive plots along with static image varieties like PNGs. This opens the door to creating charts on demand, such as for dumb Discord bots:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bOE9yFw3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/plotlychartstudio_discord.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bOE9yFw3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/plotlychartstudio_discord.gif" alt="Create Cloud-hosted Charts with Plotly Chart Studio"&gt;&lt;/a&gt;Serve charts on demand&lt;/p&gt;

&lt;p&gt;This tutorial will demonstrate plot creation in Chart Studio with a fairly common real-life scenario. We'll be fetching stock price data from &lt;a href="https://iexcloud.io/docs/api/"&gt;IEX Cloud&lt;/a&gt;, transforming the data in a Pandas DataFrame, and outputting a Candlestick chart. This same workflow can be applied to create any of Plotly's &lt;a href="https://plotly.com/python/"&gt;many chart types&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Set Up
&lt;/h2&gt;

&lt;p&gt;The two Plotly-related libraries we need are &lt;code&gt;plotly&lt;/code&gt; and &lt;code&gt;chart-studio&lt;/code&gt;. We'll install these along with the usual suspects for Python data analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;requests pandas plotly chart-studio python-dotenv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Install requirements



&lt;h3&gt;
  
  
  Accounts &amp;amp; Configuration
&lt;/h3&gt;

&lt;p&gt;You can create a Plotly Chart Studio account &lt;a href="https://chart-studio.plotly.com/feed/"&gt;here&lt;/a&gt;; get yourself set up and grab an API key. I'm also using an API key to access data from IEX, which can be substituted for whichever API credentials you happen to need:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Configuration via environment variables."""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="c1"&gt;# Load values from .env
&lt;/span&gt;&lt;span class="n"&gt;basedir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;abspath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;__file__&lt;/span&gt; &lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;basedir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'.env'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# API
&lt;/span&gt;&lt;span class="n"&gt;IEX_API_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'IEX_API_TOKEN'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;IEX_API_BASE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'https://cloud.iexapis.com/stable/stock/'&lt;/span&gt;

&lt;span class="c1"&gt;# Plotly
&lt;/span&gt;&lt;span class="n"&gt;PLOTLY_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'PLOTLY_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;PLOTLY_USERNAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'PLOTLY_USERNAME'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
config.py




&lt;p&gt;Save your secrets in a &lt;strong&gt;.env&lt;/strong&gt; file as always. If you're not sure what this means, please stop coding and Google what &lt;a href="https://www.google.com/search?q=environment+variables&amp;amp;rlz=1C5CHFA_enUS882US883&amp;amp;oq=environment+variables&amp;amp;aqs=chrome..69i57j0l7.3221j1j7&amp;amp;sourceid=chrome&amp;amp;ie=UTF-8"&gt;environment variables&lt;/a&gt; are. Seriously, stop DMing me that my code is "broken" because you didn't create a &lt;strong&gt;.env&lt;/strong&gt; file, or take 5 minutes to grasp what that even means.&lt;/p&gt;
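&lt;p&gt;If a missing variable is a recurring headache, one option is to fail loudly at startup instead of passing &lt;code&gt;None&lt;/code&gt; around. The sketch below is an assumption on my part rather than part of the project: &lt;code&gt;require_env()&lt;/code&gt; is a hypothetical helper that wraps &lt;code&gt;environ.get()&lt;/code&gt; and raises a readable error when a value is absent.&lt;/p&gt;

```python
"""Fail fast when a required environment variable is missing."""
from os import environ


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of the original config.py.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name!r}. "
            "Add it to your .env file or export it in your shell."
        )
    return value
```

&lt;p&gt;In &lt;strong&gt;config.py&lt;/strong&gt;, something like &lt;code&gt;PLOTLY_API_KEY = require_env('PLOTLY_API_KEY')&lt;/code&gt; would then surface the problem at import time, after &lt;code&gt;load_dotenv()&lt;/code&gt; has run.&lt;/p&gt;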

&lt;p&gt;What were we talking about, again? Right, charts.&lt;/p&gt;
&lt;h3&gt;
  
  
  Initializing Chart Studio
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://pypi.org/project/chart-studio/"&gt;chart_studio&lt;/a&gt; is a standalone Python library that serves the sole purpose of saving charts to Plotly Chart Studio. We need to authenticate with Plotly once upfront before we chart anything, which we can do by importing the &lt;code&gt;set_credentials_file&lt;/code&gt; function and passing our Plotly &lt;strong&gt;username&lt;/strong&gt; and &lt;strong&gt;API key&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;chart_studio.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;set_credentials_file&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;PLOTLY_USERNAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;PLOTLY_API_KEY&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Plotly Chart Studio authentication
&lt;/span&gt;&lt;span class="n"&gt;set_credentials_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PLOTLY_USERNAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PLOTLY_API_KEY&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Authenticating with Plotly Chart Studio




&lt;p&gt;We'll come back to this later. Let's get some data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Fetching &amp;amp; Preparing Data
&lt;/h2&gt;

&lt;p&gt;Beautiful charts need beautiful data. We're going to get some beautiful data using two libraries you should hopefully be sick of by now: &lt;strong&gt;requests&lt;/strong&gt; and &lt;strong&gt;Pandas&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  Fetch Data from an API
&lt;/h3&gt;

&lt;p&gt;Let's &lt;code&gt;GET&lt;/code&gt; this bread! Below is a function called &lt;code&gt;fetch_stock_data()&lt;/code&gt;, which constructs a &lt;code&gt;GET&lt;/code&gt; request to our API. In my example, we're fetching one month's worth of stock price data for a given stock symbol:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Fetch data from third-party API."""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;IEX_API_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;IEX_API_TOKEN&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_stock_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stock_symbol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch stock data from API"""&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'token'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IEX_API_TOKEN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'includeToday'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'true'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;f'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;IEX_API_BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;stock_symbol&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/chart/1m'&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Fetch Stock Price Data




&lt;p&gt;This is a straightforward GET request made to a REST API endpoint. The above effectively constructs and hits this URL:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;https://cloud.iexapis.com/stable/stock/net/chart/1m?token&lt;span class="o"&gt;=&lt;/span&gt;sk_54632753687697685697456&amp;amp;includeToday&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
IEX endpoint for 1-month price data
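&lt;p&gt;To see how the pieces of that URL fit together, here is a rough standard-library sketch. It is not what &lt;strong&gt;requests&lt;/strong&gt; does internally, only an approximation; &lt;code&gt;build_chart_url()&lt;/code&gt; is a hypothetical helper and &lt;code&gt;YOUR_TOKEN&lt;/code&gt; is a placeholder.&lt;/p&gt;

```python
"""Sketch of how a base URL, path, and query parameters combine."""
from urllib.parse import urlencode


def build_chart_url(base_url: str, stock_symbol: str, params: dict) -> str:
    """Join base URL and symbol with exactly one slash, then append the query string.

    Hypothetical helper for illustration; requests handles this for us.
    """
    path = f"{base_url.rstrip('/')}/{stock_symbol}/chart/1m"
    return f"{path}?{urlencode(params)}"


url = build_chart_url(
    "https://cloud.iexapis.com/stable/stock/",
    "net",
    {"token": "YOUR_TOKEN", "includeToday": "true"},
)
print(url)
```

&lt;p&gt;Stripping the trailing slash before joining keeps the path from ending up with a double slash, which is an easy mistake when the base URL already ends in one.&lt;/p&gt;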




&lt;p&gt;If everything goes correctly, &lt;code&gt;req.content&lt;/code&gt; should spit out some raw JSON. Here's a glance at what that looks like:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2020-08-20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"open"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;39.28&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"close"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.31&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.44&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"low"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;39.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"volume"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3953482&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uOpen"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;39.28&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uClose"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.31&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uHigh"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.44&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uLow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;39.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uVolume"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3953482&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1.02&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"changePercent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;2.5961&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Aug 20"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"changeOverTime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.059674&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2020-08-21"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"open"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"close"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;38.92&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.49&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"low"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;38.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"volume"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3271742&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uOpen"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.45&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uClose"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;38.92&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uHigh"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;40.49&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uLow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;38.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"uVolume"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3271742&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"change"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-1.39&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"changePercent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-3.4483&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"label"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Aug 21"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"changeOverTime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.023134&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
API response




&lt;p&gt;This response might seem daunting at first glance, but it's exactly what we're looking for. We just need a little help from an old friend...&lt;/p&gt;
&lt;h3&gt;
  
  
  Parsing with Pandas
&lt;/h3&gt;

&lt;p&gt;It's time for some mo effin' Pandas! We're going to take the &lt;code&gt;data&lt;/code&gt; we fetched from our API request and pass it into a new function called &lt;code&gt;parse_data()&lt;/code&gt;, which will load our data into a Pandas DataFrame. Shoutout to the beautiful Pandas function &lt;code&gt;read_json()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Parse raw API data into Pandas DataFrame."""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="s"&gt;"""Parse JSON as Pandas DataFrame."""&lt;/span&gt;
    &lt;span class="n"&gt;stock_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;stock_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dayofweek&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;inplace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;stock_df&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Create Pandas DataFrame from raw JSON
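&lt;p&gt;The &lt;code&gt;dayofweek&lt;/code&gt; filter in &lt;code&gt;parse_data()&lt;/code&gt; keeps only Monday through Friday, since Pandas numbers days from 0 (Monday) to 6 (Sunday). The same idea can be sketched with nothing but the standard library. The records below are a trimmed stand-in for the API response, and the Saturday row is made up purely for the demo:&lt;/p&gt;

```python
"""Weekday filtering with the standard library (illustration only)."""
from datetime import date

records = [
    {"date": "2020-08-20", "close": 40.31},  # Thursday
    {"date": "2020-08-21", "close": 38.92},  # Friday
    {"date": "2020-08-22", "close": 0.0},    # Saturday, made-up row for the demo
]


def is_weekday(iso_date: str) -> bool:
    """Monday is 0 and Sunday is 6, so trading days fall in 0 through 4."""
    return date.fromisoformat(iso_date).weekday() in range(5)


weekday_records = [row for row in records if is_weekday(row["date"])]
print([row["date"] for row in weekday_records])
```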




&lt;p&gt;It's important to note that we pass in &lt;code&gt;req.content&lt;/code&gt; from our request - &lt;em&gt;not&lt;/em&gt; &lt;code&gt;req.json()&lt;/code&gt;. &lt;code&gt;req.json()&lt;/code&gt; parses the response into Python objects (here, a list of dictionaries), which you &lt;em&gt;could&lt;/em&gt; load into Pandas with &lt;code&gt;pd.DataFrame()&lt;/code&gt;, but whatever. Stick with me here.&lt;/p&gt;
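&lt;p&gt;If the distinction between raw content and parsed JSON feels fuzzy, this small sketch may help. It uses the built-in &lt;code&gt;json&lt;/code&gt; module as a stand-in for what happens to a response body; the &lt;code&gt;raw&lt;/code&gt; value is a hand-written sample, not real API output.&lt;/p&gt;

```python
"""Raw bytes versus parsed Python objects."""
import json

# Stand-in for req.content: the undecoded bytes of a JSON array
raw = b'[{"date": "2020-08-20", "open": 39.28, "close": 40.31}]'

# Roughly what req.json() hands back: parsed Python objects
parsed = json.loads(raw)

print(type(raw).__name__)        # bytes
print(type(parsed).__name__)     # list
print(type(parsed[0]).__name__)  # dict
```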
&lt;h2&gt;
  
  
  Create a Chart with Plotly Chart Studio
&lt;/h2&gt;

&lt;p&gt;Let's handle this part one step at a time. There are three things we need to do here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Plotly chart with the data from our Pandas DataFrame.&lt;/li&gt;
&lt;li&gt;Adjust aspects of our chart's layout (such as title, color, etc.).&lt;/li&gt;
&lt;li&gt;Save our chart to Plotly's cloud.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each step above is nice and easy. Here's &lt;strong&gt;step 1&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Plotly chart creation."""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;plotly.graph_objects&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;go&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;chart_studio.plotly&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;chart_studio.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;set_credentials_file&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataFrame&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PLOTLY_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PLOTLY_USERNAME&lt;/span&gt;


&lt;span class="c1"&gt;# Plotly Chart Studio authentication
&lt;/span&gt;&lt;span class="n"&gt;set_credentials_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PLOTLY_USERNAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PLOTLY_API_KEY&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_chart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="s"&gt;"""Create Plotly chart from Pandas DataFrame."""&lt;/span&gt;
    &lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Candlestick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'open'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'high'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'low'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'close'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;)],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Create chart from Pandas DataFrame




&lt;p&gt;If you've ever used Plotly or Plotly Dash, this syntax should look familiar! &lt;code&gt;go.Figure()&lt;/code&gt; is a Python class used to create charts/plots/figures/whatever. We're passing &lt;code&gt;data&lt;/code&gt; as a keyword argument, which is actually just one of 3 notable arguments we could pass:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;data&lt;/strong&gt; : Accepts one &lt;em&gt;or multiple&lt;/em&gt; "trace" types. A "trace" refers to the data on our plot; in other words, we could easily chart different types of data on the same plot, such as a combination of line chart data ( &lt;code&gt;go.Scatter&lt;/code&gt; ) and bar chart data ( &lt;code&gt;go.Bar&lt;/code&gt; ). In our case, we're sticking with a single candlestick trace, where we pass in columns of our Pandas DataFrame for each value &lt;code&gt;go.Candlestick()&lt;/code&gt; expects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;layout&lt;/strong&gt; : A plot's layout gives us the power to change anything and everything about our chart's appearance. This could be background color, trace color, plot title, margins, and so much more. Just &lt;a href="https://plotly.com/python-api-reference/generated/plotly.graph_objects.Layout.html"&gt;look at all the ways&lt;/a&gt; we can customize our plot - it's almost ridiculous!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;frames&lt;/strong&gt; : Apparently, we could &lt;em&gt;animate&lt;/em&gt; our charts by passing multiple frames of changing data into the &lt;code&gt;frames&lt;/code&gt; keyword argument. That's just fuckin bonkers imo and not something we should worry about until we're older. &lt;/li&gt;
&lt;/ul&gt;
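&lt;p&gt;One way to picture these three arguments is as the top-level keys of a plain dictionary, which is roughly the shape a Plotly figure takes when serialized. The sketch below is a simplified illustration under that assumption, not the exact output of &lt;code&gt;go.Figure()&lt;/code&gt;; the prices are borrowed from the sample response earlier and the title is a placeholder.&lt;/p&gt;

```python
"""A figure's three top-level parts, written as a plain dictionary."""
import json

figure_spec = {
    # One or more traces; here a single candlestick trace
    "data": [
        {
            "type": "candlestick",
            "x": ["2020-08-20", "2020-08-21"],
            "open": [39.28, 40.45],
            "high": [40.44, 40.49],
            "low": [39.12, 38.9],
            "close": [40.31, 38.92],
        }
    ],
    # Everything about appearance: title, axes, colors, margins
    "layout": {"title": {"text": "Example title"}},
    # Optional animation frames; empty because we are not animating
    "frames": [],
}

print(json.dumps(figure_spec, indent=2))
```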

&lt;p&gt;Anyway, on to &lt;strong&gt;step 2&lt;/strong&gt; : customizing our layout:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_chart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="s"&gt;"""Create Plotly chart from Pandas DataFrame."""&lt;/span&gt;
    &lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Candlestick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'open'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'high'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'low'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'close'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Layout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;f'30-day performance of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;upper&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;xaxis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="s"&gt;'type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="s"&gt;'rangeslider'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                  &lt;span class="s"&gt;'visible'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
              &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Modify plot layout




&lt;p&gt;We pass &lt;code&gt;layout&lt;/code&gt; into our figure immediately after passing &lt;code&gt;data&lt;/code&gt;. As we've already established, there's a shit ton of possibilities we could get deep into here. I'm keeping things modest by setting a chart title and making sure the x-axis knows it's a range of dates.&lt;/p&gt;

&lt;p&gt;We can finally move on to &lt;strong&gt;step 3&lt;/strong&gt;, saving our hard work to the cloud:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_chart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="s"&gt;"""Create Plotly chart from Pandas DataFrame and return the chart's URL."""&lt;/span&gt;
    &lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Figure&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Candlestick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'open'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'high'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;low&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'low'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;close&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stock_df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'close'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;)],&lt;/span&gt;
        &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Layout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;f'30-day performance of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;upper&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;xaxis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="s"&gt;'type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'date'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="s"&gt;'rangeslider'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                  &lt;span class="s"&gt;'visible'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
              &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chart&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;filename&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;auto_open&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;fileopt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'overwrite'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;sharing&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'public'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chart&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Create chart




&lt;p&gt;The options we're passing to &lt;code&gt;py.plot()&lt;/code&gt; are pretty straightforward:&lt;/p&gt;

&lt;ul&gt;
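&lt;li&gt;Setting the filename (&lt;code&gt;filename&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Keeping the chart from automatically opening in a browser (&lt;code&gt;auto_open&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Handling file clashes by overwriting preexisting charts with the same name (&lt;code&gt;fileopt&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Making the chart available to the public (&lt;code&gt;sharing&lt;/code&gt;).&lt;/li&gt;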
&lt;/ul&gt;

&lt;p&gt;So, what does &lt;code&gt;chart&lt;/code&gt; actually contain? Here's what a &lt;code&gt;print()&lt;/code&gt; would reveal:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;https://plotly.com/~toddbirchard/466/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Output of &lt;code&gt;print(chart)&lt;/code&gt;




&lt;p&gt;It's a link to our chart! But wait, what if I don't want to view my chart on Plotly's site? What if I want an image? Check out what &lt;code&gt;chart[:-1] + '.png'&lt;/code&gt; gives you:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;https://plotly.com/~toddbirchard/466.png
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;


&lt;p&gt;Which is a URL to...&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Cz7NdDjV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/09/466.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Cz7NdDjV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/09/466.png" alt="Create Cloud-hosted Charts with Plotly Chart Studio"&gt;&lt;/a&gt;&lt;a href="https://plotly.com/%7Etoddbirchard/466.png"&gt;https://plotly.com/~toddbirchard/466.png&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;WE DID IT!&lt;/p&gt;
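
&lt;p&gt;One caveat: the slice in &lt;code&gt;chart[:-1] + '.png'&lt;/code&gt; only works because the URL printed above ends in a trailing slash. As a minimal sketch, a small helper could strip the slash only when it's actually there. The name &lt;code&gt;chart_image_url&lt;/code&gt; is hypothetical and not part of the tutorial's source:&lt;/p&gt;

```python
def chart_image_url(chart_url):
    """Turn a Chart Studio chart URL into the URL of its PNG image.

    Hypothetical helper: strips a trailing slash if one exists,
    then appends the .png extension shown in the tutorial.
    """
    return chart_url.rstrip("/") + ".png"


print(chart_image_url("https://plotly.com/~toddbirchard/466/"))
# https://plotly.com/~toddbirchard/466.png
```

&lt;p&gt;Unlike the plain slice, this returns the same result whether or not the URL ends in a slash.&lt;/p&gt;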
&lt;h2&gt;
  
  
  Organizing Our Work
&lt;/h2&gt;

&lt;p&gt;Throughout this tutorial, you've kinda just been watching me dump random code blocks one by one. If this confuses the shit out of you, it'll probably help to know that this is how I chose to structure my project:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/plotly-chartstudio-tutorial
├── /plotly_chartstudio_tutorial
│   ├── __init__.py
│   ├── api.py
│   ├── chart.py
│   ├── data.py
│   └── log.py
├── config.py
├── main.py
├── requirements.txt
└── .env
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Project structure




&lt;p&gt;&lt;em&gt;"That doesn't tell me shit,"&lt;/em&gt; you might exclaim. Luckily, that was just a setup to tell you that the source code for this tutorial is up on GitHub. Enjoy:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hackersandslackers"&gt;
        hackersandslackers
      &lt;/a&gt; / &lt;a href="https://github.com/hackersandslackers/plotly-chartstudio-tutorial"&gt;
        plotly-chartstudio-tutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      📈 📊 Create Cloud-hosted Charts with Plotly Chart Studio.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
Plotly Chart Studio Tutorial&lt;/h1&gt;
&lt;p&gt;Use Pandas and Plotly to create cloud-hosted data visualizations on demand in Python. Source for the accompanying tutorial: &lt;a href="https://hackersandslackers.com/plotly-chart-studio/" rel="nofollow"&gt;https://hackersandslackers.com/plotly-chart-studio/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://raw.githubusercontent.com/hackersandslackers/plotly-chartstudio-tutorial/master/./.github/plotly-chartstudio@2x.jpg"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NKh0p9_t--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://raw.githubusercontent.com/hackersandslackers/plotly-chartstudio-tutorial/master/./.github/plotly-chartstudio%402x.jpg" alt="Plotly Chart Studio"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Installation via &lt;code&gt;requirements.txt&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;$ git clone https://github.com/hackersandslackers/plotly-chartstudio-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; plotly-chartstudio-tutorial
$ python3 -m venv myenv
$ &lt;span class="pl-c1"&gt;source&lt;/span&gt; myenv/bin/activate
$ pip3 install -r requirements.txt
$ python3 main.py&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Installation via &lt;a href="https://pipenv-fork.readthedocs.io/en/latest/" rel="nofollow"&gt;Pipenv&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;$ git clone https://github.com/hackersandslackers/plotly-chartstudio-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; plotly-chartstudio-tutorial
$ pipenv shell
$ pipenv update
$ python3 main.py&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Installation via &lt;a href="https://python-poetry.org/" rel="nofollow"&gt;Poetry&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;$ git clone https://github.com/hackersandslackers/plotly-chartstudio-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; plotly-chartstudio-tutorial
$ poetry shell
$ poetry update
$ poetry run python3 main.py&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Hackers and Slackers&lt;/strong&gt; tutorials are free of charge. If you found this tutorial helpful, a &lt;a href="https://www.buymeacoffee.com/hackersslackers" rel="nofollow"&gt;small donation&lt;/a&gt; would be greatly appreciated to keep us in business. All proceeds go towards coffee, and all coffee goes towards more content.&lt;/p&gt;
&lt;/div&gt;

  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/hackersandslackers/plotly-chartstudio-tutorial"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



</description>
      <category>plotly</category>
      <category>python</category>
      <category>datascience</category>
      <category>dataanalysis</category>
    </item>
    <item>
      <title>Scrape Structured Data with Python and Extruct</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Sat, 08 Aug 2020 14:57:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/scrape-structured-data-with-python-and-extruct-109l</link>
      <guid>https://dev.to/hackersandslackers/scrape-structured-data-with-python-and-extruct-109l</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uBZL4t_G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/07/json-ld-pyld-1%402x.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uBZL4t_G--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/07/json-ld-pyld-1%402x.jpg" alt="Scrape Structured Data with Python and Extruct"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Unless you're entirely oblivious to scraping data in Python (and probably ended up here by accident), you're well aware that scraping data in Python begins and ends with &lt;a href="https://pypi.org/project/beautifulsoup4/"&gt;&lt;strong&gt;BeautifulSoup&lt;/strong&gt;&lt;/a&gt;. BeautifulSoup is Python's scraping powerhouse: we first demonstrated this in a &lt;a href="https://hackersandslackers.com/scraping-urls-with-beautifulsoup/"&gt;previous post&lt;/a&gt; where we put together a script to fetch site metadata (title, description, preview images, etc.) from any target URL. We were able to build a scraper which fetched a target site's &lt;code&gt;&amp;lt;meta&amp;gt;&lt;/code&gt; tags (and various fallbacks) to create a fairly reliable tool to summarize the contents of any URL, which is precisely the logic used to generate link "previews" such as these:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3lzmcLb9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/l5dysm3v7b6zpz6urhn1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3lzmcLb9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/l5dysm3v7b6zpz6urhn1.png" alt="Link Preview"&gt;&lt;/a&gt;&lt;/p&gt;
Example of a preview link with data fetched via BeautifulSoup.



&lt;p&gt;Perusing the various sites and entities we refer to as "the internet" has traditionally felt like navigating an unstandardized Wild West. There's never a guarantee that the website you're targeting adheres to any web standards (despite it being in their own best interest). These situations lead us to write scripts with complicated fallbacks in case the owner of &lt;strong&gt;myhorriblewebsite.angelfire.com&lt;/strong&gt; somehow managed to forget to give their page a &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt;, and so forth. Search engines and other big players recognized this. &lt;a href="https://json-ld.org/"&gt;JSON-LD&lt;/a&gt; was born as a standardized, reliable format for site publishers to include machine-readable (and also quite human-readable) metadata to appease search engines and fight for relevancy.&lt;/p&gt;

&lt;p&gt;This post builds upon the goal of scraping site metadata, which we previously explored with BeautifulSoup, via a different method: parsing JSON-LD metadata with Python's &lt;a href="https://github.com/scrapinghub/extruct"&gt;extruct&lt;/a&gt; library.&lt;/p&gt;
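
&lt;p&gt;For context, JSON-LD lives in a page's &lt;code&gt;&amp;lt;script type="application/ld+json"&amp;gt;&lt;/code&gt; tags. The following is a rough standard-library sketch of pulling those blocks out of raw HTML. It is not how extruct is implemented, and the names &lt;code&gt;JsonLdParser&lt;/code&gt; and &lt;code&gt;extract_json_ld&lt;/code&gt; are made up for illustration:&lt;/p&gt;

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the parsed contents of script tags typed application/ld+json."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Only start buffering for script tags that declare JSON-LD.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script contents may arrive in several chunks, so buffer them.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip blocks that aren't valid JSON.


def extract_json_ld(html):
    """Return a list of parsed JSON-LD blocks found in an HTML string."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

&lt;p&gt;Calling &lt;code&gt;extract_json_ld()&lt;/code&gt; with a page's HTML returns a list with one parsed entry per JSON-LD block. A library like extruct is still the better choice in practice, since it covers other metadata formats as well.&lt;/p&gt;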

&lt;p&gt;What's so great about JSON-LD, you might ask? Aside from dodging the hellish experience of traversing the DOM by hand, JSON-LD is a specification with notable advantages over old-school HTML &lt;code&gt;&amp;lt;meta&amp;gt;&lt;/code&gt; tags. The multitude of benefits can mostly be boiled down into two categories: &lt;strong&gt;data granularity&lt;/strong&gt; and &lt;strong&gt;linked data&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Granularity
&lt;/h2&gt;

&lt;p&gt;JSON-LD allows web pages to express an impressive amount of granular information about &lt;em&gt;what&lt;/em&gt; each page is. For instance, here's the JSON-LD for one of my posts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"@context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://schema.org/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Article"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Person"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Todd Birchard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers-cdn.storage.googleapis.com/2020/04/todd@2x.jpg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"sameAs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"https://toddbirchard.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://twitter.com/ToddRBirchard"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"keywords"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Golang, DevOps, Software Development"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"headline"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Deploy a Golang Web Application Behind Nginx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers.com/deploy-golang-app-nginx/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"datePublished"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2020-06-01T07:30:00.000-04:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateModified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2020-06-01T09:03:55.000-04:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ImageObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers-cdn.storage.googleapis.com/2020/05/golang-nginx-3.jpg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"width"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1000"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"height"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"523"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"publisher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Organization"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Hackers and Slackers"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"founder"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Todd Birchard"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"logo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ImageObject"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers-cdn.storage.googleapis.com/2020/03/logo-blue-full.png"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"width"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"height"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Deploy a self-hosted Go web application using Nginx as a reverse proxy. "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mainEntityOfPage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WebPage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers.com"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Example JSON-LD for a Hackers and Slackers post.




&lt;p&gt;There's significantly more information stored in the above snippet than in all other meta tags on the same page combined. JSON-LD surely supports more attributes than traditional meta tags, but beyond that, the representation of data in a JSON hierarchy makes it &lt;em&gt;immediately&lt;/em&gt; clear how page metadata is related: we're looking at an object representing an article, written by an author, as part of an "organization."&lt;/p&gt;
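
&lt;p&gt;Because it's plain JSON, that hierarchy maps directly onto nested Python dictionaries. Here's a minimal sketch using a trimmed copy of the snippet above. The &lt;code&gt;summarize()&lt;/code&gt; helper is hypothetical:&lt;/p&gt;

```python
import json

# Trimmed copy of the JSON-LD example above.
raw = """
{
  "@context": "https://schema.org/",
  "@type": "Article",
  "author": {"@type": "Person", "name": "Todd Birchard"},
  "headline": "Deploy a Golang Web Application Behind Nginx",
  "publisher": {"@type": "Organization", "name": "Hackers and Slackers"}
}
"""


def summarize(metadata):
    """Pull a few related fields out of a JSON-LD object.

    Uses .get() throughout, since publishers may omit any of these keys.
    """
    author = metadata.get("author", {})
    publisher = metadata.get("publisher", {})
    return {
        "type": metadata.get("@type"),
        "headline": metadata.get("headline"),
        "author": author.get("name"),
        "publisher": publisher.get("name"),
    }


summary = summarize(json.loads(raw))
print(summary)
```

&lt;p&gt;The relationships between the article, its author, and its publisher come straight from the nesting, with no DOM traversal involved.&lt;/p&gt;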

&lt;p&gt;Google's explanation of the benefits of structuring metadata goes something like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Structured data is a standardized format for providing information about a page and classifying the page content; for example, on a recipe page, what are the ingredients, the cooking time and temperature, the calories, and so on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Type
&lt;/h3&gt;

&lt;p&gt;The term "web page" is uselessly ambiguous, as web pages are documents that can provide information in any number of forms. Web pages might be articles, recipes, product pages, events, and &lt;em&gt;far more.&lt;/em&gt; The official schema of possible page types includes &lt;a href="https://schema.org/docs/full.html"&gt;&lt;em&gt;over one thousand possibilities&lt;/em&gt;&lt;/a&gt; for what "type" or "subtype" a page might be considered to be. Knowing the "type" of a page reduces ambiguity, and declaring a page "type" allows us to attach type-specific metadata to pages as well! For instance, let's compare the attributes of an &lt;strong&gt;Episode&lt;/strong&gt; type to an &lt;strong&gt;Article&lt;/strong&gt; type:&lt;/p&gt;




  
    &lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
      &lt;thead&gt;
        &lt;tr&gt;
          &lt;th colspan="2"&gt;Episode&lt;/th&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;th&gt;Property&lt;/th&gt;
          &lt;th&gt;Description&lt;/th&gt;
        &lt;/tr&gt;
      &lt;/thead&gt;

      &lt;tbody&gt;
        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/actor"&gt;actor&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;An actor, e.g. in tv, radio, movie, video games etc., or in an event. Actors can be associated with individual items or with a series, episode, clip. Supersedes &lt;a href="http://schema.org/actors"&gt;actors&lt;/a&gt;.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/director"&gt;director&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;A director of e.g. tv, radio, movie, video gaming etc. content, or of an event. Directors can be associated with individual items or with a series, episode, clip. Supersedes &lt;a href="http://schema.org/directors"&gt;directors&lt;/a&gt;.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/episodeNumber"&gt;episodeNumber&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;Position of the episode within an ordered group of episodes.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/musicBy"&gt;musicBy&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The composer of the soundtrack.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/partOfSeason"&gt;partOfSeason&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The season to which this episode belongs.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/partOfSeries"&gt;partOfSeries&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The series to which this episode or season belongs. Supersedes &lt;a href="http://schema.org/partOfTVSeries"&gt;partOfTVSeries&lt;/a&gt;.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/productionCompany"&gt;productionCompany&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The production company or studio responsible for the item e.g. series, video game, episode etc.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/trailer"&gt;trailer&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The trailer of a movie or tv/radio series, season, episode, etc.&lt;/td&gt;
        &lt;/tr&gt;

      &lt;/tbody&gt;
    &lt;/table&gt;&lt;/div&gt;
  

  
    &lt;div class="table-wrapper-paragraph"&gt;&lt;table class="definition-table"&gt;
      &lt;thead&gt;
        &lt;tr&gt;
          &lt;th colspan="2"&gt;Article&lt;/th&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;th&gt;Property&lt;/th&gt;
          &lt;th&gt;Description&lt;/th&gt;
        &lt;/tr&gt;
      &lt;/thead&gt;

      &lt;tbody&gt;
        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/articleBody"&gt;articleBody&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The actual body of the article.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/articleSection"&gt;articleSection&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;Articles may belong to one or more 'sections' in a magazine or newspaper, such as Sports, Lifestyle, etc.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a title="Defined in section: pending.schema.org" class="ext ext-pending" href="http://schema.org/backstory"&gt;backstory&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;For an &lt;a href="http://schema.org/Article"&gt;Article&lt;/a&gt;, typically a &lt;a href="http://schema.org/NewsArticle"&gt;NewsArticle&lt;/a&gt;, the
            backstory property provides a textual summary giving a brief explanation of why and how an article was created. In a journalistic setting this could include information about reporting process, methods, interviews, data sources, etc.&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/pageEnd"&gt;pageEnd&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The page on which the work ends; for example "138" or "xvi".&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/pageStart"&gt;pageStart&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The page on which the work starts; for example "135" or "xiii".&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/pagination"&gt;pagination&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;Any description of pages that is not separated into pageStart and pageEnd; for example, "1-6, 9, 55" or "10-12, 46-49".&lt;/td&gt;
        &lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/speakable"&gt;speakable&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;Indicates sections of a Web page that are particularly 'speakable' in the sense of being highlighted as being especially appropriate for text-to-speech conversion. Other sections of a page may
            also
            be usefully spoken in particular circumstances; the 'speakable' property serves to indicate the parts most likely to be generally useful for speech.&lt;br&gt;&lt;br&gt;
        &lt;/td&gt;
&lt;/tr&gt;

        &lt;tr&gt;
          &lt;th class="prop-nam"&gt;
            &lt;code&gt;&lt;a href="http://schema.org/wordCount"&gt;wordCount&lt;/a&gt;&lt;/code&gt;
          &lt;/th&gt;
          &lt;td class="prop-desc"&gt;The number of words in the text of the Article.&lt;/td&gt;
        &lt;/tr&gt;

      &lt;/tbody&gt;
    &lt;/table&gt;&lt;/div&gt;
  




&lt;p&gt;There are obviously data attributes of television shows which don't apply to news articles (such as actors, director, etc.), and vice versa. The level of specificity achievable is nearly unfathomable when we discover that types have &lt;strong&gt;subtypes&lt;/strong&gt;. For instance, our article might be an opinion piece, a subtype which extends the &lt;strong&gt;Article&lt;/strong&gt; type with &lt;em&gt;even more attributes&lt;/em&gt;.&lt;/p&gt;
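&lt;p&gt;To make the idea of subtypes a bit more concrete, here's a rough sketch of how a script might decide whether a page's declared type "counts" as an Article. The &lt;code&gt;PARENT_TYPE&lt;/code&gt; mapping is a tiny hand-written excerpt of the schema.org hierarchy, and the helper names &lt;code&gt;type_chain&lt;/code&gt; and &lt;code&gt;is_a&lt;/code&gt; are made up for this illustration:&lt;/p&gt;

```python
# Hand-written excerpt of the schema.org hierarchy (child -> parent).
# Assumption: only these five types are listed, purely for illustration.
PARENT_TYPE = {
    "OpinionNewsArticle": "NewsArticle",
    "NewsArticle": "Article",
    "Article": "CreativeWork",
    "CreativeWork": "Thing",
    "Thing": None,
}


def type_chain(type_name):
    """Return the type followed by each known ancestor, most specific first."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = PARENT_TYPE.get(type_name)
    return chain


def is_a(type_name, ancestor):
    """True if `type_name` is `ancestor` or one of its subtypes."""
    return ancestor in type_chain(type_name)


print(type_chain("OpinionNewsArticle"))
print(is_a("OpinionNewsArticle", "Article"))
```

&lt;p&gt;A check like this lets a scraper treat every Article subtype the same way without listing each one by hand.&lt;/p&gt;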
&lt;h3&gt;
  
  
  Who
&lt;/h3&gt;

&lt;p&gt;All content has a creator, yet content-creators can take many forms. Authors, publishers, and organizations could simultaneously be considered the responsible party for any given content, as these properties are not mutually exclusive. For instance, here's how my author data is parsed on posts like this one:&lt;/p&gt;




  &lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th colspan="2"&gt;Author&lt;/th&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;th&gt;Property&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;

  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;@type&lt;/td&gt;
      &lt;td&gt;Person&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;name&lt;/td&gt;
      &lt;td&gt;Todd Birchard&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;image&lt;/td&gt;
      &lt;td&gt;https://hackersandslackers-cdn.storage.googleapis.com/2020/04/todd@2x.jpg&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;sameAs&lt;/td&gt;
      &lt;td&gt;https://toddbirchard.com/&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;sameAs&lt;/td&gt;
      &lt;td&gt;https://twitter.com/ToddRBirchard&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;What makes this data especially interesting are the values listed under the &lt;strong&gt;sameAs&lt;/strong&gt; attribute, which associate the "Todd Birchard" in question with the very same Todd Birchard of the website &lt;a href="https://toddbirchard.com/"&gt;https://toddbirchard.com/&lt;/a&gt;, and Twitter account &lt;a href="https://twitter.com/ToddRBirchard"&gt;https://twitter.com/ToddRBirchard&lt;/a&gt;. This undoubtedly assists search engines in making associations between entities on the web. Still, a keen imagination may recognize the opportunity to leverage these strong associations to dox or harass strangers on the internet quite easily.&lt;/p&gt;
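&lt;p&gt;As a small illustration of how these associations could be read programmatically, the sketch below collects the &lt;strong&gt;sameAs&lt;/strong&gt; links from an author object. The &lt;code&gt;same_as_links&lt;/code&gt; helper is hypothetical, and it assumes the value may arrive as either a single string or a list of strings:&lt;/p&gt;

```python
def same_as_links(entity):
    """Return an entity's sameAs values as a list of URLs without trailing slashes."""
    value = entity.get("sameAs", [])
    # Assumption: sameAs may be a single string or a list of strings.
    if isinstance(value, str):
        value = [value]
    return [url.rstrip("/") for url in value]


# Sample data copied from the author table above.
author = {
    "@type": "Person",
    "name": "Todd Birchard",
    "sameAs": ["https://toddbirchard.com/", "https://twitter.com/ToddRBirchard"],
}

print(same_as_links(author))
```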
&lt;h2&gt;
  
  
  Scrape Something Together
&lt;/h2&gt;

&lt;p&gt;Along with Extruct, we'll be installing our good friend &lt;strong&gt;requests&lt;/strong&gt; to fetch pages for us:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;requests extruct
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Install libraries




&lt;p&gt;You already know the drill — pick a single URL for now and loot them for all they've got by returning &lt;code&gt;.text&lt;/code&gt; from our request's response:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_html&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Get raw HTML from a URL."""&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Methods'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Headers'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Max-Age'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3600'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'User-Agent'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Retrieve a page's HTML.




&lt;p&gt;Simple stuff. Here's where extruct comes in; I'm tossing together a function called &lt;strong&gt;get_metadata&lt;/strong&gt;, which will do precisely what you'd assume. We can take the raw HTML we grabbed with &lt;strong&gt;get_html&lt;/strong&gt; and pass it to our new function to pillage:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Fetch structured JSON-LD data from a given URL."""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pprint&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;extruct&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;w3lib.html&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_base_url&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Parse structured data from a target page."""&lt;/span&gt;
    &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_html&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch JSON-LD structured data."""&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extruct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;get_base_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;syntaxes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Getting structured data with extruct.




&lt;p&gt;Using extruct is as easy as passing raw HTML as a string and a site's "base URL" with &lt;code&gt;extruct.extract(html, base_url=url)&lt;/code&gt;. A "base URL" refers to a site's entry point (or homepage, whatever) for the targeted page. The page you're on right now is &lt;a href="https://hackersandslackers.com/scrape-metadata-json-ld/"&gt;https://hackersandslackers.com/scrape-metadata-json-ld/&lt;/a&gt;. Thus the base URL, in this case, would be &lt;a href="https://hackersandslackers.com/"&gt;https://hackersandslackers.com/&lt;/a&gt;. There's a library called &lt;strong&gt;w3lib&lt;/strong&gt; (not part of Python's standard library; it is installed as a dependency of extruct) that has a function to handle this exact task, hence our usage of &lt;code&gt;base_url=get_base_url(html, url)&lt;/code&gt;.&lt;/p&gt;
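&lt;p&gt;One reason a base URL is worth having: metadata can contain relative paths, which only become usable once they're joined to it. Here's a minimal sketch using the standard library's &lt;code&gt;urljoin&lt;/code&gt;. The relative paths are made up for illustration, and this isn't meant to mirror what extruct does internally:&lt;/p&gt;

```python
from urllib.parse import urljoin

base_url = "https://hackersandslackers.com/"

# Hypothetical relative paths, as they might appear in a page's metadata.
post_url = urljoin(base_url, "/scrape-metadata-json-ld/")
image_url = urljoin(base_url, "2020/04/todd@2x.jpg")

# An absolute URL is returned unchanged.
external_url = urljoin(base_url, "https://toddbirchard.com/")

print(post_url)
print(image_url)
print(external_url)
```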

&lt;p&gt;This is what our extract function returns, using one of my posts as an example:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'@context':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://schema.org/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'@type':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Article'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'author':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'@type':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Person'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="err"&gt;'image':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://hackersandslackers-cdn.storage.googleapis.com/&lt;/span&gt;&lt;span class="mi"&gt;2020&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;04&lt;/span&gt;&lt;span class="err"&gt;/todd@&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="err"&gt;x.jpg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="err"&gt;'name':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Todd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Birchard'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="err"&gt;'sameAs':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;'https://toddbirchard.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://twitter.com/ToddRBirchard'&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'dateModified':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="mi"&gt;2020-06-11&lt;/span&gt;&lt;span class="err"&gt;T&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;57&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mf"&gt;57.000&lt;/span&gt;&lt;span class="mi"&gt;-04&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'datePublished':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="mi"&gt;2018-11-11&lt;/span&gt;&lt;span class="err"&gt;T&lt;/span&gt;&lt;span class="mi"&gt;08&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="mf"&gt;9.000&lt;/span&gt;&lt;span class="mi"&gt;-05&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'description':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Use Python's BeautifulSoup library to assist in the honest act of systematically stealing data without permission."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'headline':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Scraping&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Data&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Web&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;BeautifulSoup'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'image':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'@type':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'ImageObject'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="err"&gt;'height':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="mi"&gt;523&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="err"&gt;'url':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://hackersandslackers-cdn.storage.googleapis.com/&lt;/span&gt;&lt;span class="mi"&gt;2020&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;06&lt;/span&gt;&lt;span class="err"&gt;/beautifulsoup&lt;/span&gt;&lt;span class="mi"&gt;-1-1&lt;/span&gt;&lt;span class="err"&gt;.jpg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
             &lt;/span&gt;&lt;span class="err"&gt;'width':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'keywords':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Python&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Data&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Engineering'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'mainEntityOfPage':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'@id':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://hackersandslackers.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'@type':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'WebPage'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'publisher':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'@type':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Organization'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                 &lt;/span&gt;&lt;span class="err"&gt;'founder':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Todd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Birchard'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                 &lt;/span&gt;&lt;span class="err"&gt;'logo':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'@type':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'ImageObject'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                           &lt;/span&gt;&lt;span class="err"&gt;'height':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                           &lt;/span&gt;&lt;span class="err"&gt;'url':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://hackersandslackers-cdn.storage.googleapis.com/&lt;/span&gt;&lt;span class="mi"&gt;2020&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;03&lt;/span&gt;&lt;span class="err"&gt;/logo-blue-full.png'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                           &lt;/span&gt;&lt;span class="err"&gt;'width':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
                 &lt;/span&gt;&lt;span class="err"&gt;'name':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'Hackers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Slackers'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;'url':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'https://hackersandslackers.com/scraping-urls-with-beautifulsoup/'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
JSON-LD data for a Hackers and Slackers post.




&lt;p&gt;One of the keyword arguments we passed to extruct was &lt;strong&gt;syntaxes&lt;/strong&gt;, which is an optional argument where we specify which &lt;em&gt;flavor&lt;/em&gt; of structured data we're after (apparently there's more than one). Possible options to pass are &lt;code&gt;'microdata'&lt;/code&gt;, &lt;code&gt;'json-ld'&lt;/code&gt;, &lt;code&gt;'opengraph'&lt;/code&gt;, &lt;code&gt;'microformat'&lt;/code&gt;, and &lt;code&gt;'rdfa'&lt;/code&gt;. If nothing is passed, extruct will attempt to fetch &lt;em&gt;all of the above&lt;/em&gt; and return the results in a dictionary. This is why we follow up our extruct call by accessing the &lt;code&gt;['json-ld']&lt;/code&gt; key.&lt;/p&gt;
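&lt;p&gt;To picture what comes back when no syntax is specified, here's a hand-written stand-in for that dictionary. The values are invented and the shape is simplified, so treat it as a sketch and not as real extruct output. The hypothetical &lt;code&gt;non_empty_syntaxes&lt;/code&gt; helper reports which flavors a page actually uses:&lt;/p&gt;

```python
# Hand-written stand-in for a result keyed by syntax name.
# Assumption: values are invented and simplified for illustration.
sample_result = {
    "microdata": [],
    "json-ld": [{"@type": "Article", "headline": "Scraping Data on the Web with BeautifulSoup"}],
    "opengraph": [{"og:title": "Example title"}],
    "microformat": [],
    "rdfa": [],
}


def non_empty_syntaxes(result):
    """Return the names of the syntaxes that actually produced data."""
    return [syntax for syntax, items in result.items() if items]


print(non_empty_syntaxes(sample_result))

# Accessing a single flavor is a plain dictionary lookup.
json_ld_items = sample_result["json-ld"]
print(json_ld_items[0]["@type"])
```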
&lt;h3&gt;
  
  
  Dealing with Inconsistent Results
&lt;/h3&gt;

&lt;p&gt;You might be wondering why we index &lt;code&gt;[0]&lt;/code&gt; after getting our results from extruct. This is a symptom of structured data: where traditional &lt;code&gt;&amp;lt;meta&amp;gt;&lt;/code&gt; tags are predictably 1-dimensional, the "structure" of structured data is flexible and determined by developers. This flexibility gives developers the power to do things like define a site's share image as a list of dicts (multiple meta images) as opposed to a single dict. This makes the output of any given site's data unpredictable, which poses problems for Python scripts that can't know in advance whether they should be indexing into a list or accessing a dictionary value.&lt;/p&gt;

&lt;p&gt;The way I handle this is by explicitly checking the Python type of data being returned before extracting it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;render_json_ltd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch JSON-LD structured data."""&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extruct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;get_domain&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;syntaxes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Check the "type" of structured data
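&lt;p&gt;The same check can be pulled out into a small reusable helper. The sketch below is one possible way to do it (the name &lt;code&gt;first_dict&lt;/code&gt; is made up): it accepts whatever a site hands us and returns a single dict, or &lt;code&gt;None&lt;/code&gt; when there's nothing usable:&lt;/p&gt;

```python
def first_dict(value):
    """Return `value` if it is a dict, the first dict inside it if it is a list, else None."""
    if isinstance(value, dict):
        return value
    if isinstance(value, list):
        for item in value:
            if isinstance(item, dict):
                return item
    return None


# The same lookup works whether a site provides one object or several.
print(first_dict({"@type": "Article"}))
print(first_dict([{"@type": "Article"}, {"@type": "WebPage"}]))
print(first_dict([]))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for anything unexpected keeps the calling code down to a single check instead of a pile of nested conditionals.&lt;/p&gt;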




&lt;p&gt;This uncertainty of returned data types occurs everywhere. In the example where a page may have multiple meta images, I might write a function like &lt;code&gt;get_image()&lt;/code&gt; below, where I explicitly check the type of data being returned for a given attribute while traversing the data tree:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed_metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape parsed_metadata `share image`."""&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;_data&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Several images: keep the first&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Nested object: read its `image` key&lt;/span&gt;
    &lt;span class="c1"&gt;# Return only a non-empty string; anything else means no image&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Extract data depending on type
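
&lt;p&gt;The same list/dict/string check tends to repeat for every attribute, so it can be folded into one helper. Below is a minimal sketch, not part of the original script: &lt;code&gt;first_string()&lt;/code&gt; and its &lt;code&gt;keys&lt;/code&gt; parameter are hypothetical names, and the assumption that nested objects keep their value under &lt;code&gt;url&lt;/code&gt; or &lt;code&gt;@id&lt;/code&gt; is mine.&lt;/p&gt;

```python
from typing import Any, Optional


def first_string(value: Any, keys=("url", "@id")) -> Optional[str]:
    """Reduce a JSON-LD value (str, list, or dict) to a single string.

    Hypothetical helper: `keys` lists the dict keys assumed to hold the value.
    """
    if isinstance(value, str):
        # Treat an empty string the same as a missing value.
        return value or None
    if isinstance(value, list):
        # Try each item in order and keep the first usable string.
        for item in value:
            found = first_string(item, keys)
            if found:
                return found
        return None
    if isinstance(value, dict):
        # Objects such as ImageObject keep the string under a known key.
        for key in keys:
            found = first_string(value.get(key), keys)
            if found:
                return found
    return None
```

&lt;p&gt;With something like this, a call such as &lt;code&gt;first_string(metadata.get('image'))&lt;/code&gt; would cover a bare URL, a list of URLs, and an &lt;code&gt;ImageObject&lt;/code&gt; alike.&lt;/p&gt;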



&lt;h3&gt;
  
  
  Put it to Work
&lt;/h3&gt;

&lt;p&gt;A script to fetch and return structured data from a site would look something like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Fetch structured JSON-LD data from a given URL."""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;pprint&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;extruct&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;w3lib.html&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_base_url&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Parse structured data from a target page."""&lt;/span&gt;
    &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_html&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_html&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Get raw HTML from a URL."""&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Methods'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Headers'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Max-Age'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3600'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'User-Agent'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch JSON-LD structured data."""&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extruct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;get_base_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;syntaxes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Scrape a single page for structured data.
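
&lt;p&gt;One caveat: &lt;code&gt;get_metadata()&lt;/code&gt; keeps only the first JSON-LD object, while a page may publish several (an organization, a breadcrumb list, the article itself). A rough sketch of choosing by &lt;code&gt;@type&lt;/code&gt; instead is shown here. &lt;code&gt;pick_by_type()&lt;/code&gt; is a hypothetical helper, and the default list of wanted types is an assumption rather than anything extruct provides.&lt;/p&gt;

```python
from typing import Any, Optional


def pick_by_type(
    items: Any,
    wanted=("Article", "NewsArticle", "BlogPosting"),
) -> Optional[dict]:
    """Pick the first JSON-LD object whose @type is wanted, else the first object.

    Hypothetical helper; the `wanted` defaults are an assumption.
    """
    if isinstance(items, dict):
        items = [items]
    if not isinstance(items, list):
        return None
    candidates = [item for item in items if isinstance(item, dict)]
    for item in candidates:
        types = item.get("@type")
        # @type may be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        if isinstance(types, list) and any(t in wanted for t in types):
            return item
    # Fall back to the first object, matching the script's behaviour.
    return candidates[0] if candidates else None
```

&lt;p&gt;If no object matches, the fallback keeps the behaviour of taking the first item, so swapping this in for &lt;code&gt;metadata[0]&lt;/code&gt; should not lose data on pages that carry a single block.&lt;/p&gt;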




&lt;h3&gt;
  
  
  Testing our Scraper
&lt;/h3&gt;

&lt;p&gt;Since we're grownups, it's best if we write a simple test or two for a script that could potentially be run on a massive scale. The bare minimum we could do is point our scraper at a site containing structured data and compare the output to the data we'd expect to see. Below is a small test written with pytest to check that our &lt;code&gt;scrape()&lt;/code&gt; function returns data matching a hardcoded copy of what I expect to get back:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Validate JSON-LD Scrape outcome."""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pytest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;extruct_tutorial&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;scrape&lt;/span&gt;


&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;pytest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="s"&gt;"""Target URL to scrape metadata."""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;'https://hackersandslackers.com/creating-django-views/'&lt;/span&gt;


&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;pytest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fixture&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;expected_json&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="s"&gt;"""Expected metadata to be returned."""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'@context'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'https://schema.org/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'@type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Article'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'author'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'@type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Person'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Todd Birchard'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'https://hackersandslackers-cdn.storage.googleapis.com/2020/04/todd@2x.jpg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="s"&gt;'sameAs'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'https://toddbirchard.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'https://twitter.com/ToddRBirchard'&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
            &lt;span class="s"&gt;'keywords'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Django, Python, Software Development'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'headline'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Creating Interactive Views in Django'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'https://hackersandslackers.com/creating-django-views/'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'datePublished'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'2020-04-23T12:21:00.000-04:00'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'dateModified'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'2020-05-02T13:31:33.000-04:00'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'@type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'ImageObject'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'https://hackersandslackers-cdn.storage.googleapis.com/2020/04/django-views-1.jpg'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="s"&gt;'width'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'1000'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'height'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'523'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="s"&gt;'publisher'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'@type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Organization'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Hackers and Slackers'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'founder'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Todd Birchard'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                          &lt;span class="s"&gt;'logo'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'@type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'ImageObject'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                   &lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'https://hackersandslackers-cdn.storage.googleapis.com/2020/03/logo-blue-full.png'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                   &lt;span class="s"&gt;'width'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'height'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
            &lt;span class="s"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Create interactive user experiences by writing Django views to handle dynamic content, submitting forms, and interacting with data.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;'mainEntityOfPage'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;'@type'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'WebPage'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'@id'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'https://hackersandslackers.com'&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_scrape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_json&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Match scrape's fetched metadata to known value."""&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scrape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;expected_json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
test_scrape.py
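
&lt;p&gt;Comparing the whole dictionary against a live page can be brittle, since a field such as &lt;code&gt;dateModified&lt;/code&gt; changes whenever the post is edited. One possible alternative is to assert only on the fields that should stay stable. The sketch below assumes a hypothetical &lt;code&gt;is_subset()&lt;/code&gt; helper; it is not part of pytest or extruct.&lt;/p&gt;

```python
from typing import Any


def is_subset(expected: Any, actual: Any) -> bool:
    """Return True when every key/value in `expected` also appears in `actual`.

    Hypothetical helper: dictionaries are compared recursively, and any other
    value must match exactly.
    """
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and is_subset(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual
```

&lt;p&gt;A test could then read &lt;code&gt;assert is_subset({'@type': 'Article', 'author': {'name': 'Todd Birchard'}}, scrape(url))&lt;/code&gt; and keep passing after the post is edited.&lt;/p&gt;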



&lt;h2&gt;
  
  
  Build a Metadata Scraper
&lt;/h2&gt;

&lt;p&gt;Of course, &lt;code&gt;scrape()&lt;/code&gt; simply puts data on a silver platter for you - there's still the work of grabbing the values. To give you an example of a fully fleshed-out script to scrape metadata with extruct, I'll share with you my own personal treasure: the script I use to generate link previews:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Fetch structured JSON-LD data from a given URL."""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;extruct&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;w3lib.html&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_base_url&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Parse structured data from a URL."""&lt;/span&gt;
    &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_html&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;json_ld&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;card&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"bookmark"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"bookmark"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="s"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_canonical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="s"&gt;"metadata"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="s"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_canonical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="s"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="s"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_author&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="s"&gt;"publisher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_publisher&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="s"&gt;"thumbnail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;card&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_html&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Get raw HTML from a URL."""&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Methods'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Headers'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Max-Age'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3600'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'User-Agent'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch JSON-LD structured data."""&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;extruct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;get_base_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;syntaxes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;uniform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="s"&gt;'json-ld'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch title via extruct."""&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'headline'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'headline'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'headline'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'headline'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'headline'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"'"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch share image via extruct."""&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch description via extruct."""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_author&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch author name via extruct with BeautifulSoup fallback."""&lt;/span&gt;
    &lt;span class="n"&gt;author&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'author'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'author'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;author&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'author'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'author'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;author&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'author'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;author&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_publisher&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch publisher name via extruct."""&lt;/span&gt;
    &lt;span class="n"&gt;publisher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'publisher'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'publisher'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;publisher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'publisher'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'publisher'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;publisher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'publisher'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;publisher&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_canonical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="s"&gt;"""Fetch canonical URL via extruct."""&lt;/span&gt;
    &lt;span class="n"&gt;canonical&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'mainEntityOfPage'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'mainEntityOfPage'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;canonical&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'mainEntityOfPage'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'@id'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nb"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'mainEntityOfPage'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json_ld&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'mainEntityOfPage'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;canonical&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Metadata scraper with extruct
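
&lt;p&gt;The author and publisher helpers above repeat the same list-or-dict check. As a rough sketch (not the code from the repo), that pattern could be folded into a single helper. The &lt;code&gt;first_name&lt;/code&gt; function and the sample payload below are hypothetical names made up for illustration, and the sketch relies only on the standard library:&lt;/p&gt;

```python
from typing import Any, Optional


def first_name(value: Any) -> Optional[str]:
    """Return the name of a JSON-LD node given as a dict, a list of dicts, or a plain string."""
    if isinstance(value, list):
        # Use the first entry when several are listed; an empty list has no name.
        value = value[0] if value else None
    if isinstance(value, dict):
        value = value.get("name")
    # Anything that did not resolve to a string is treated as missing.
    return value if isinstance(value, str) else None


# Hypothetical JSON-LD payload, shaped like what a parser might return for an article.
sample = {
    "headline": "Example Article",
    "author": [{"@type": "Person", "name": "Jane Doe"}],
    "publisher": {"@type": "Organization", "name": "Example Press"},
}

print(first_name(sample.get("author")))     # Jane Doe
print(first_name(sample.get("publisher")))  # Example Press
print(first_name(sample.get("editor")))     # None
```

&lt;p&gt;One design choice worth noting: the sketch returns &lt;code&gt;None&lt;/code&gt; for empty lists and unexpected shapes rather than raising, which suits scraping, where a site's metadata rarely looks the way you'd hope.&lt;/p&gt;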



&lt;h2&gt;
  
  
  One More For the Toolbox
&lt;/h2&gt;

&lt;p&gt;Unless you're &lt;em&gt;actually&lt;/em&gt; looking to create link previews like the one I included, using extruct as a standalone library without a larger plan or toolkit won't deliver much beyond an easy interface for pulling better metadata from individual web pages. Instead, consider the bigger picture of what a single page's metadata gives us. We now have effortless access to information that crawlers can use to move through sites, associate data with individuals, and ultimately build a picture of an entity's entire web presence, whether that entity is a person, an organization, or something else.&lt;/p&gt;

&lt;p&gt;If you look closely, one of &lt;strong&gt;extruct&lt;/strong&gt;'s main dependencies is actually &lt;strong&gt;BeautifulSoup&lt;/strong&gt;. You could argue that you could have written this library yourself, and you might be right, but that isn't the point. Data mining behemoths aren't nuclear arsenals; they're collections of tools cleverly used in conjunction to wreak havoc upon the world as efficiently as possible. We're getting there.&lt;/p&gt;

&lt;p&gt;This has been a quick little script, but if you're interested, I've thrown the source up on GitHub here:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hackersandslackers"&gt;
        hackersandslackers
      &lt;/a&gt; / &lt;a href="https://github.com/hackersandslackers/jsonld-scraper-tutorial"&gt;
        jsonld-scraper-tutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      🌎 🖥 Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
Structured Data Scraping Tutorial&lt;/h1&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/81ce3b893a5758f5989827bc3c48e6f91e6b5884/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d76253545332e382d626c75652e7376673f6c6f676f3d707974686f6e266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d356538316163267374796c653d666c61742d73717561726526636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/81ce3b893a5758f5989827bc3c48e6f91e6b5884/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f507974686f6e2d76253545332e382d626c75652e7376673f6c6f676f3d707974686f6e266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d356538316163267374796c653d666c61742d73717561726526636f6c6f72413d346335363661" alt="Python"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/fe7e1acb5911ae24a90ee52c6f0fc434986e3cc9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f457874727563742d76302e392e302d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d666c61736b267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/fe7e1acb5911ae24a90ee52c6f0fc434986e3cc9/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f457874727563742d76302e392e302d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d666c61736b267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661" alt="Extruct"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/e5e554304142fe04ea0f36af7a23287a728802bd/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f52657175657374732d76322e32342e302d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d666c61736b267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/e5e554304142fe04ea0f36af7a23287a728802bd/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f52657175657374732d76322e32342e302d626c75652e7376673f6c6f6e6743616368653d74727565266c6f676f3d666c61736b267374796c653d666c61742d737175617265266c6f676f436f6c6f723d776869746526636f6c6f72423d35653831616326636f6c6f72413d346335363661" alt="Requests"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/1b19c794c686895f24227b6a35069ba11f893af1/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562"&gt;&lt;img src="https://camo.githubusercontent.com/1b19c794c686895f24227b6a35069ba11f893af1/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562" alt="GitHub Last Commit"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/jsonld-scraper-tutorial/issues"&gt;&lt;img src="https://camo.githubusercontent.com/5bd62336f186600b8e8a510d07e1f77904844010/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6973737565732f6861636b657273616e64736c61636b6572732f6a736f6e6c642d736372617065722d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Issues"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/jsonld-scraper-tutorial/stargazers"&gt;&lt;img src="https://camo.githubusercontent.com/3345d5dba056d3c14d66695ea7be72e67fbec3f9/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f6861636b657273616e64736c61636b6572732f6a736f6e6c642d736372617065722d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Stars"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/jsonld-scraper-tutorial/network"&gt;&lt;img src="https://camo.githubusercontent.com/7feba46523e7034ee5896be32543a1106a47adb9/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f666f726b732f6861636b657273616e64736c61636b6572732f6a736f6e6c642d736372617065722d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d346335363661266c6f676f3d47697448756226636f6c6f72423d656263623862" alt="GitHub Forks"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://raw.githubusercontent.com/hackersandslackers/jsonld-scraper-tutorial/master/.github/json-ld-pyld-1@2x.jpg"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ej1Aiujn--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://raw.githubusercontent.com/hackersandslackers/jsonld-scraper-tutorial/master/.github/json-ld-pyld-1%402x.jpg" alt="Extruct Tutorial"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's &lt;a href="https://github.com/scrapinghub/extruct"&gt;extruct&lt;/a&gt; library.&lt;/p&gt;
&lt;p&gt;This repository contains source code for the accompanying tutorial on Hackers and Slackers: &lt;a href="https://hackersandslackers.com/scrape-metadata-json-ld/" rel="nofollow"&gt;https://hackersandslackers.com/scrape-metadata-json-ld/&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Installation via &lt;code&gt;requirements.txt&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; jsonld-scraper-tutorial
$ python3 -m venv myenv
$ &lt;span class="pl-c1"&gt;source&lt;/span&gt; myenv/bin/activate
$ pip3 install -r requirements.txt
$ python3 main.py&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Installation via &lt;a href="https://pipenv-fork.readthedocs.io/en/latest/" rel="nofollow"&gt;Pipenv&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; jsonld-scraper-tutorial
$ pipenv shell
$ pipenv update
$ python3 main.py&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Installation via &lt;a href="https://python-poetry.org/" rel="nofollow"&gt;Poetry&lt;/a&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ &lt;span class="pl-c1"&gt;cd&lt;/span&gt; jsonld-scraper-tutorial
$ poetry shell
$ poetry update
$ poetry run&lt;/pre&gt;&lt;/div&gt;
&lt;h2&gt;
Usage&lt;/h2&gt;
&lt;p&gt;To change the URL targeted by this script, update the &lt;code&gt;URL&lt;/code&gt; variable in &lt;strong&gt;config.py&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hackers and Slackers&lt;/strong&gt; tutorials are free of charge. If you found this tutorial helpful, a &lt;a href="https://www.buymeacoffee.com/hackersslackers" rel="nofollow"&gt;small donation&lt;/a&gt; would be greatly appreciated to keep us in business. All proceeds go towards coffee, and…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/hackersandslackers/jsonld-scraper-tutorial"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;Until next time.&lt;/p&gt;

</description>
      <category>python</category>
      <category>scraping</category>
      <category>dataengineering</category>
      <category>scraper</category>
    </item>
    <item>
      <title>Deploy Serverless Golang Functions with Netlify</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Mon, 03 Aug 2020 13:09:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/deploy-serverless-golang-functions-with-netlify-4m3e</link>
      <guid>https://dev.to/hackersandslackers/deploy-serverless-golang-functions-with-netlify-4m3e</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4BZ0Rbx_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/netlify-lambda-go%402x.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4BZ0Rbx_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/netlify-lambda-go%402x.jpg" alt="Deploy Serverless Golang Functions with Netlify"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The race to monetize static-site hype was over before it began: if you're a JAMStack developer, you're a Netlify customer. Surprisingly, the outcome of this unintentional vendor lock-in has been working out pretty well. JAMStack's paradigm of webhook-driven actions rewrites the narrative of static sites as dynamic entities. By either luck or foresight, Netlify is comfortably positioned to provide a home for static sites as the only cloud host to focus solely on this model of development, offering features like serverless functions.&lt;/p&gt;

&lt;p&gt;Any market that finds itself dominated by a single vendor is bound to have some downsides. In Netlify's case, one such downside comes in the form of occasionally poor documentation, &lt;em&gt;especially&lt;/em&gt; regarding Golang function deployment. Netlify has precisely &lt;a href="https://docs.netlify.com/functions/build-with-go/"&gt;one page of documentation&lt;/a&gt; dedicated to writing serverless functions in Go, with zero mention of &lt;em&gt;how to actually deploy said functions&lt;/em&gt;. Compare this to Netlify's commitment to JavaScript function development, which includes a &lt;a href="https://github.com/netlify/cli/blob/master/docs/netlify-dev.md#netlify-functions"&gt;dedicated CLI&lt;/a&gt;, a &lt;a href="https://github.com/netlify/netlify-lambda"&gt;build plugin&lt;/a&gt;, and &lt;a href="https://docs.netlify.com/functions/build-with-javascript/#format"&gt;documentation&lt;/a&gt; that covers useful details.&lt;/p&gt;

&lt;p&gt;This tutorial assumes you have a basic understanding of Golang and GatsbyJS; this post is &lt;em&gt;not&lt;/em&gt; a tutorial about learning Go from scratch. We &lt;em&gt;will&lt;/em&gt; cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The basics of Go development in the context of Lambda functions.&lt;/li&gt;
&lt;li&gt;Deploying Go functions as part of a GatsbyJS site hosted on Netlify.&lt;/li&gt;
&lt;li&gt;Interacting with and monitoring deployed Go functions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Creating a Go Serverless Function
&lt;/h2&gt;

&lt;p&gt;We'll start by creating a new Go project on our GOPATH as we normally would:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; ~/go/src/github.com/toddbirchard/netlify-serverless-tutorial
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~/go/src/github.com/toddbirchard/netlify-serverless-tutorial
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Create a directory to store your function source.




&lt;p&gt;Next, we initialize our module by running &lt;code&gt;go mod init&lt;/code&gt; in the current folder, which should produce the following outcome:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;module github.com/toddbirchard/netlify-serverless-tutorial

go 1.14
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
go.mod




&lt;p&gt;Naming your Go module to match the URL of a GitHub repository is a common practice you should already abide by. Still, it's &lt;em&gt;especially&lt;/em&gt; important to do so for our purposes. Netlify creates serverless functions by taking the Go source code we give it and building executable binaries. To do so, Netlify needs to be aware of the dependencies our project might have. Instead of uploading our Go source with all its dependencies, Netlify intelligently looks to the dependencies found in our module's GitHub repository and pulls them itself.&lt;/p&gt;

&lt;p&gt;Get started by pasting the minimum boilerplate into a new file called &lt;strong&gt;main.go&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/aws/aws-lambda-go/events"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/aws/aws-lambda-go/lambda"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyResponse&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;StatusCode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Hello World!"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Initiate AWS Lambda handler&lt;/span&gt;
    &lt;span class="n"&gt;lambda&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.go




&lt;p&gt;Oh snap, we've imported an external library! The &lt;strong&gt;aws-lambda-go&lt;/strong&gt; library is essential for our function to work. Get and install the library by running &lt;code&gt;go get&lt;/code&gt; in your project directory. This will update your &lt;strong&gt;go.mod&lt;/strong&gt; file accordingly:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;module github.com/toddbirchard/netlify-serverless-tutorial

go 1.14

require github.com/aws/aws-lambda-go v1.18.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
go.mod



&lt;h3&gt;
  
  
  Working with Request Data
&lt;/h3&gt;

&lt;p&gt;Netlify Lambda functions take the form of HTTP endpoints, so odds are you're building something that takes the input of a request (whether it be a request's params or body) to produce a response. We can access information about an incoming request via &lt;strong&gt;aws-lambda-go&lt;/strong&gt;'s &lt;code&gt;APIGatewayProxyRequest&lt;/code&gt; struct: Lambda's default parameter passed to our handler.&lt;/p&gt;

&lt;p&gt;Let's see this in action by grabbing a query string parameter from an incoming request and outputting the result. We can easily modify our original handler to check an incoming request for a parameter called &lt;code&gt;name&lt;/code&gt; (i.e., &lt;code&gt;example.com/function?name=todd&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;QueryStringParameters&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello %s!"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyResponse&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;StatusCode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Headers&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"Content-Type"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"text/html; charset=UTF-8"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.go




&lt;p&gt;&lt;code&gt;APIGatewayProxyRequest&lt;/code&gt; has a field called &lt;code&gt;QueryStringParameters&lt;/code&gt;, which is a key/value &lt;strong&gt;map&lt;/strong&gt; of query string names and their values. The above attempts to extract the value of a query string called &lt;strong&gt;name&lt;/strong&gt; to create a cheerful response for the user. If an incoming request does not contain a query string called &lt;strong&gt;name&lt;/strong&gt;, the lookup returns an empty string (the zero value of a Go &lt;code&gt;string&lt;/code&gt;) rather than &lt;code&gt;nil&lt;/code&gt;.&lt;/p&gt;
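
&lt;p&gt;To see how a missing key behaves, here is a minimal sketch that swaps the real request for a plain &lt;code&gt;map[string]string&lt;/code&gt;, so it runs without the AWS library. The &lt;code&gt;greeting&lt;/code&gt; helper and the &lt;code&gt;"stranger"&lt;/code&gt; fallback are hypothetical names added for illustration; they are not part of the handler above:&lt;/p&gt;

```go
package main

import "fmt"

// greeting is a hypothetical helper (not part of the tutorial's handler).
// params stands in for request.QueryStringParameters.
func greeting(params map[string]string) string {
	// The "comma ok" form tells a missing key apart from an empty value.
	name, ok := params["name"]
	if !ok || name == "" {
		name = "stranger" // assumed fallback, purely illustrative
	}
	return fmt.Sprintf("Hello %s!", name)
}

func main() {
	fmt.Println(greeting(map[string]string{"name": "todd"})) // Hello todd!
	fmt.Println(greeting(nil))                               // Hello stranger!
}
```

&lt;p&gt;Reading from a &lt;code&gt;nil&lt;/code&gt; map is safe in Go, which is why a request with no query strings at all doesn't blow up the handler.&lt;/p&gt;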

&lt;p&gt;We could extract a lot more from an &lt;code&gt;APIGatewayProxyRequest&lt;/code&gt; struct if we so chose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Path&lt;/strong&gt; : URL requested by the user which generated the request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTPMethod&lt;/strong&gt; : &lt;code&gt;GET&lt;/code&gt;, &lt;code&gt;POST&lt;/code&gt;, &lt;code&gt;PUT&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Headers&lt;/strong&gt; &amp;amp; &lt;strong&gt;MultiValueHeaders&lt;/strong&gt; : HTTP headers sent by the user's request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QueryStringParameters&lt;/strong&gt; &amp;amp; &lt;strong&gt;MultiValueQueryStringParameters&lt;/strong&gt; : Query strings with single or multiple values.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Body&lt;/strong&gt; : Data (such as JSON) sent in the body of the request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output of a Lambda function is created by returning an &lt;code&gt;APIGatewayProxyResponse&lt;/code&gt;. Our response returns a simple &lt;strong&gt;200&lt;/strong&gt; status code, along with a text response reading &lt;code&gt;"Hello {name}!"&lt;/code&gt;. We specify the content type of the response via our response's &lt;code&gt;Headers&lt;/code&gt; field, which accepts a map of headers such as &lt;code&gt;Content-Type&lt;/code&gt;.&lt;/p&gt;
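
&lt;p&gt;The other request fields work the same way. As a rough sketch, here's a handler-like function that checks the HTTP method and decodes a JSON body. To keep it runnable without &lt;strong&gt;aws-lambda-go&lt;/strong&gt;, &lt;code&gt;proxyRequest&lt;/code&gt; is a local stand-in mirroring only the &lt;code&gt;HTTPMethod&lt;/code&gt; and &lt;code&gt;Body&lt;/code&gt; fields; the &lt;code&gt;payload&lt;/code&gt; shape, the &lt;code&gt;respond&lt;/code&gt; function, and the chosen status codes are assumptions for illustration:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// proxyRequest is a local stand-in mirroring two fields of
// events.APIGatewayProxyRequest so this sketch runs without the AWS library.
type proxyRequest struct {
	HTTPMethod string
	Body       string
}

// payload is a hypothetical JSON body shape, e.g. {"name": "todd"}.
type payload struct {
	Name string `json:"name"`
}

// respond returns a status code and a body for the given request.
func respond(req proxyRequest) (int, string) {
	if req.HTTPMethod != "POST" {
		return 405, "Method Not Allowed"
	}
	data := new(payload)
	if err := json.Unmarshal([]byte(req.Body), data); err != nil {
		return 400, "Invalid JSON"
	}
	return 200, fmt.Sprintf("Hello %s!", data.Name)
}

func main() {
	fmt.Println(respond(proxyRequest{HTTPMethod: "POST", Body: `{"name": "todd"}`})) // 200 Hello todd!
	fmt.Println(respond(proxyRequest{HTTPMethod: "GET"}))                            // 405 Method Not Allowed
}
```

&lt;p&gt;In a real function you would return these values inside an &lt;code&gt;APIGatewayProxyResponse&lt;/code&gt; instead of a bare status code and string.&lt;/p&gt;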
&lt;h3&gt;
  
  
  Testing our Function
&lt;/h3&gt;

&lt;p&gt;I promised to spend minimal time on the actual Golang aspect of this post, but writing unit tests for Lambda functions is particularly tricky as it involves forming fake requests and evaluating their responses. &lt;strong&gt;aws-lambda-go&lt;/strong&gt; has a recommended syntax for writing multiple tests against a single handler which I found quite useful:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main_test&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/aws/aws-lambda-go/events"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/stretchr/testify/assert"&lt;/span&gt;
    &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="s"&gt;"github.com/toddbirchard/netlify-serverless-tutorial"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"testing"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;tests&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyRequest&lt;/span&gt;
        &lt;span class="n"&gt;expect&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c"&gt;// Test name has value&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyRequest&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;QueryStringParameters&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Paul"&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
            &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Hello Paul!"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c"&gt;// Test name is null&lt;/span&gt;
            &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyRequest&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
            &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Hello !"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;tests&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Equal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Test %d: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main_test.go




&lt;p&gt;&lt;code&gt;TestHandler&lt;/code&gt; creates two hypothetical situations (aka tests) to run against our &lt;code&gt;Handler&lt;/code&gt; function. The first test passes a query string parameter &lt;code&gt;"Paul"&lt;/code&gt;, while the second passes none at all. These scenarios are then evaluated in a loop, which outputs the following when tested with &lt;code&gt;go test&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;2020/08/01 03:17:47 Test 0: Hello Paul!
2020/08/01 03:17:47 Test 1: Hello &lt;span class="o"&gt;!&lt;/span&gt;
PASS
ok github.com/toddbirchard/netlify-serverless-tutorial 0.139s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main_test.go output




&lt;p&gt;Each test is validated via assertions belonging to the &lt;a href="https://github.com/stretchr/testify"&gt;testify&lt;/a&gt; module, which I recommend for unit tests like these. Be sure to &lt;code&gt;go get&lt;/code&gt; this module if you choose to include it:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;module github.com/toddbirchard/netlify-serverless-tutorial

go 1.14

require &lt;span class="o"&gt;(&lt;/span&gt;
    github.com/aws/aws-lambda-go v1.18.0
    github.com/stretchr/testify v1.4.0
&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
go.mod




&lt;p&gt;We now have a Lambda function that is complete enough to deploy... but how do we go about that?&lt;/p&gt;
&lt;h2&gt;
  
  
  Serverless Functions in GatsbyJS
&lt;/h2&gt;

&lt;p&gt;The structure of a minimal GatsbyJS project looks something like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/my-gatsby-project
├── /src
├── /public
├── /static
├── gatsby-browser.js
├── gatsby-config.js
├── gatsby-node.js
└── package.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
GatsbyJS project structure.




&lt;p&gt;All of the files and directories above are standard Gatsby stuff doing Gatsby things. Things become a little more interesting when we deploy a Gatsby site with Lambda functions, as our Lambda's source code and binary are going to live side-by-side with our Gatsby code. This takes the form of two directories: a &lt;strong&gt;/functions&lt;/strong&gt; directory and a &lt;strong&gt;/go-src&lt;/strong&gt; directory:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/my-gatsby-project
├── /functions
│   └── helloworld
├── /go-src
│   └── /helloworld
│       └── main.go
├── /src
├── /public
├── /static
├── gatsby-browser.js
├── gatsby-config.js
├── gatsby-node.js
└── package.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
GatsbyJS project structure.




&lt;p&gt;&lt;strong&gt;/go-src&lt;/strong&gt; is where we dump the source code of our Golang functions (&lt;strong&gt;.go&lt;/strong&gt; files). We create a subdirectory in &lt;strong&gt;/go-src&lt;/strong&gt; for each Lambda function we want to deploy. The Go source code in each subdirectory is compiled to a binary in the &lt;strong&gt;/functions&lt;/strong&gt; folder that shares the subdirectory's name. In the above example, the &lt;strong&gt;/helloworld&lt;/strong&gt; subdirectory compiles to a binary in &lt;strong&gt;functions/helloworld&lt;/strong&gt;, which also dictates the eventual URL of our endpoint:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;https://hackersandslackers.com/.netlify/functions/helloworld
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
URL of a deployed Golang Lambda function.




&lt;p&gt;If you're wondering how the source of &lt;strong&gt;main.go&lt;/strong&gt; got into our new &lt;strong&gt;/go-src/helloworld&lt;/strong&gt; destination, the underwhelming answer is a simple copy-and-paste. No tricks there.&lt;/p&gt;
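
&lt;p&gt;The naming convention described above can be sketched in a few lines of Go. &lt;code&gt;functionPaths&lt;/code&gt; is a hypothetical helper written for this illustration (nothing in Netlify or the project calls it); it just spells out how a subdirectory name maps to a binary and a URL path:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"path"
)

// functionPaths is a hypothetical helper: given the name of a subdirectory
// in go-src, it returns where the compiled binary lands and the URL path
// the function is served from.
func functionPaths(name string) (binary string, endpoint string) {
	binary = path.Join("functions", name)
	endpoint = path.Join("/.netlify/functions", name)
	return binary, endpoint
}

func main() {
	binary, endpoint := functionPaths("helloworld")
	fmt.Println(binary)   // functions/helloworld
	fmt.Println(endpoint) // /.netlify/functions/helloworld
}
```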
&lt;h3&gt;
  
  
  Building Netlify Lambda Binaries in GatsbyJS
&lt;/h3&gt;

&lt;p&gt;Let's recap where we're at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GatsbyJS project&lt;/strong&gt;: Check ✅&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda source code&lt;/strong&gt;: Check ✅&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Binaries built from source upon project builds&lt;/strong&gt;: Not check 🚫&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When we deploy to Netlify, we want our project to find the source in &lt;strong&gt;/go-src&lt;/strong&gt; and build the resulting binaries to the &lt;strong&gt;/functions&lt;/strong&gt; directory. Netlify looks for a build command in &lt;a href="https://docs.netlify.com/configure-builds/file-based-configuration/"&gt;netlify.toml&lt;/a&gt;, which would typically be &lt;code&gt;gatsby build&lt;/code&gt; under normal circumstances. Since we're now building our site &lt;em&gt;and&lt;/em&gt; functions, we're going to recruit the help of a &lt;strong&gt;Makefile&lt;/strong&gt; to enable a more involved build process.&lt;/p&gt;

&lt;p&gt;Make sure you have a &lt;strong&gt;netlify.toml&lt;/strong&gt; file in your GatsbyJS project root where we can specify the build command of our Makefile, as well as define our functions folder:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[build]&lt;/span&gt;
  &lt;span class="py"&gt;base&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/"&lt;/span&gt;
  &lt;span class="py"&gt;command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"make build"&lt;/span&gt;
  &lt;span class="py"&gt;publish&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/public"&lt;/span&gt;
  &lt;span class="py"&gt;functions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/functions"&lt;/span&gt;

&lt;span class="nn"&gt;[build.environment]&lt;/span&gt;
  &lt;span class="py"&gt;GO_VERSION&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.14.5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
netlify.toml




&lt;p&gt;We've changed what you most likely had as &lt;code&gt;command = "gatsby build"&lt;/code&gt; to &lt;code&gt;command = "make build"&lt;/code&gt;, which looks for a Makefile containing a &lt;strong&gt;build&lt;/strong&gt; command to execute. Let's create that now.&lt;/p&gt;
&lt;h3&gt;
  
  
  Making a Makefile
&lt;/h3&gt;

&lt;p&gt;If you're new to &lt;a href="https://en.wikipedia.org/wiki/Makefile"&gt;Makefiles&lt;/a&gt;, they're a standard fixture in projects that allow us to define one-liners that kick off a series of events, such as building a project. Our Makefile is going to look like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nl"&gt;build&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
   &lt;span class="err"&gt;npm&lt;/span&gt; &lt;span class="err"&gt;run-script&lt;/span&gt; &lt;span class="err"&gt;build&lt;/span&gt;
   &lt;span class="err"&gt;mkdir&lt;/span&gt; &lt;span class="err"&gt;-p&lt;/span&gt; &lt;span class="err"&gt;functions&lt;/span&gt;
   &lt;span class="nv"&gt;GOOS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;linux
   &lt;span class="nv"&gt;GOARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;amd64
   &lt;span class="nv"&gt;GO111MODULE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;on
   &lt;span class="nv"&gt;GOBIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;${PWD}&lt;/span&gt;/functions go get ./...
   &lt;span class="nv"&gt;GOBIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;${PWD}&lt;/span&gt;/functions go &lt;span class="nb"&gt;install&lt;/span&gt; ./...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Makefile




&lt;p&gt;Hopefully, you find this string of commands to be as confusing as shit as I did (you might be overqualified for reading this post otherwise). We'll break down this &lt;code&gt;build&lt;/code&gt; command line-by-line:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;npm run-script build&lt;/code&gt;: This command runs a script in &lt;strong&gt;package.json&lt;/strong&gt; labeled as "build," which is likely running &lt;code&gt;gatsby build&lt;/code&gt;. This step is building your Gatsby site as you normally would for production, so nothing special there.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mkdir -p functions&lt;/code&gt;: Creates a directory called &lt;strong&gt;functions&lt;/strong&gt; in your project's root directory, in case it doesn't exist. Since I don't recommend uploading your compiled binaries to GitHub (Netlify will rebuild them regardless), there's a good chance this folder won't exist in your repository. As a result, we need this command as a way to create a fresh &lt;strong&gt;functions&lt;/strong&gt; folder on our Netlify server.&lt;/li&gt;
&lt;li&gt;
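&lt;code&gt;GOOS=linux&lt;/code&gt; &amp;amp; &lt;code&gt;GOARCH=amd64&lt;/code&gt;: Golang's &lt;a href="https://gist.github.com/asukakenji/f15ba7e588ac42795f421b48b8aede63"&gt;&lt;strong&gt;GOOS&lt;/strong&gt; and &lt;strong&gt;GOARCH&lt;/strong&gt;&lt;/a&gt; environment variables tell Go which target operating system (Linux) and architecture (AMD 64-bit) our binaries are intended to be built for. The values we're passing here align with the OS and architecture of Netlify's server, thus we're ensuring Go is targeting the correct infrastructure to compile binaries for. One caveat: Make runs each recipe line in its own shell, so a variable set on a line by itself only reaches the &lt;code&gt;go&lt;/code&gt; commands if it is exported or placed on the same line as the command (as &lt;code&gt;GOBIN&lt;/code&gt; is). On Netlify's servers the defaults already match these values, but keep it in mind when building on a different OS.&lt;/li&gt;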
&lt;li&gt;
&lt;code&gt;GO111MODULE=on&lt;/code&gt;: Things get a bit interesting here, as the Go source we're building in &lt;strong&gt;/go-src&lt;/strong&gt; is not actually on our GOPATH, nor will it be upon deployment to Netlify. &lt;a href="https://insujang.github.io/2020-04-04/go-modules/"&gt;GO111MODULE&lt;/a&gt; is a workaround for building Go projects outside of the designated GOPATH; this allows us to get dependencies of functions living in our Gatsby project and build the resulting binaries without worrying about being shackled to our (or Netlify's) GOPATH.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;GOBIN=${PWD}/functions go install ./...&lt;/code&gt;: Lastly, we need to &lt;code&gt;get&lt;/code&gt; and &lt;code&gt;install&lt;/code&gt; all of our functions' dependencies to build them into our final binaries. This line effectively looks for all Go packages in the project, compiles them, and places the resulting binaries in the &lt;strong&gt;/functions&lt;/strong&gt; folder.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Give it a Whirl
&lt;/h3&gt;

&lt;p&gt;We're ready to test our function locally! Run &lt;code&gt;make build&lt;/code&gt; in your Gatsby root folder to make sure things run smoothly. If you're successful, you should see an output like this towards the end of your build:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;info Done building &lt;span class="k"&gt;in &lt;/span&gt;63.568185195 sec
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; functions
&lt;span class="nv"&gt;GOOS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;linux
&lt;span class="nv"&gt;GOARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;amd64
&lt;span class="nv"&gt;GO111MODULE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;on
&lt;span class="nv"&gt;GOBIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/Users/toddbirchard/projects/stockholm/functions go get ./...
&lt;span class="nv"&gt;GOBIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/Users/toddbirchard/projects/stockholm/functions go &lt;span class="nb"&gt;install&lt;/span&gt; ./...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Output of &lt;code&gt;make build&lt;/code&gt;




&lt;p&gt;You should now see Go binaries in your local &lt;strong&gt;/functions&lt;/strong&gt; directory! Add your &lt;strong&gt;/functions&lt;/strong&gt; directory to your &lt;code&gt;.gitignore&lt;/code&gt; file and commit that masterpiece.&lt;/p&gt;
&lt;h2&gt;
  
  
  See it in Action
&lt;/h2&gt;

&lt;p&gt;To demonstrate a GatsbyJS site utilizing a Netlify function, I set up a demo for your pleasure &lt;a href="https://serverless-golang-tutorial.netlify.app/"&gt;here&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--TIdelDRP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/netlify-functions-demo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--TIdelDRP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/08/netlify-functions-demo.png" alt="Deploy Serverless Golang Functions with Netlify"&gt;&lt;/a&gt;&lt;a href="https://serverless-golang-tutorial.netlify.app"&gt;https://serverless-golang-tutorial.netlify.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a simple GatsbyJS site that takes the output of our helloworld function and writes it to the page via some very simple frontend JavaScript. I busted some ass to put together a somewhat respectable demo for you, because copying is way easier than reading. Don't say I've never done anything for you:&lt;/p&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vJ70wriM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-ba8488d21cd8ee1fee097b8410db9deaa41d0ca30b004c0c63de0a479114156f.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/hackersandslackers"&gt;
        hackersandslackers
      &lt;/a&gt; / &lt;a href="https://github.com/hackersandslackers/netlify-functions-tutorial"&gt;
        netlify-functions-tutorial
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Write and deploy Golang Lambda Functions to your GatsbyJS site on Netlify.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
Netlify Function Golang Tutorial&lt;/h1&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/7d7d2f475a28753405717c9dd4d236b9f96acd76/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4761747362792d76253545322e382d79656c6c6f772e7376673f6c6f6e6743616368653d74727565267374796c653d666c61742d737175617265266c6f676f3d476174736279266c6f676f436f6c6f723d776869746526636f6c6f72413d34633536366126636f6c6f72423d623438656164"&gt;&lt;img src="https://camo.githubusercontent.com/7d7d2f475a28753405717c9dd4d236b9f96acd76/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4761747362792d76253545322e382d79656c6c6f772e7376673f6c6f6e6743616368653d74727565267374796c653d666c61742d737175617265266c6f676f3d476174736279266c6f676f436f6c6f723d776869746526636f6c6f72413d34633536366126636f6c6f72423d623438656164" alt="Gatsby"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/6ede23afbedc002bf67678964b4d28b7b316029e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f476f2d312e31342d626c75652e7376673f6c6f676f3d676f266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d383843304430267374796c653d666c61742d73717561726526636f6c6f72413d346335363661"&gt;&lt;img src="https://camo.githubusercontent.com/6ede23afbedc002bf67678964b4d28b7b316029e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f476f2d312e31342d626c75652e7376673f6c6f676f3d676f266c6f6e6743616368653d74727565266c6f676f436f6c6f723d776869746526636f6c6f72423d383843304430267374796c653d666c61742d73717561726526636f6c6f72413d346335363661" alt="Go"&gt;&lt;/a&gt;
&lt;a rel="noopener noreferrer" href="https://camo.githubusercontent.com/1b19c794c686895f24227b6a35069ba11f893af1/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562"&gt;&lt;img src="https://camo.githubusercontent.com/1b19c794c686895f24227b6a35069ba11f893af1/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f676f6f676c652f736b69612e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d613362653863266c6f676f3d476974487562" alt="GitHub Last Commit"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/netlify-functions-tutorial/issues"&gt;&lt;img src="https://camo.githubusercontent.com/eeff2be2c1dade400166ef84ea81cf12583ea603/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6973737565732f6861636b657273616e64736c61636b6572732f6e65746c6966792d66756e6374696f6e732d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72423d65626362386226636f6c6f72413d346335363661266c6f676f3d476974487562" alt="GitHub issues"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/netlify-functions-tutorial/stargazers"&gt;&lt;img src="https://camo.githubusercontent.com/87391e8ec39c56f5d00259a2b83cd7e686dc71d0/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f6861636b657273616e64736c61636b6572732f6e65746c6966792d66756e6374696f6e732d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72423d65626362386226636f6c6f72413d346335363661266c6f676f3d476974487562" alt="GitHub stars"&gt;&lt;/a&gt;
&lt;a href="https://github.com/hackersandslackers/netlify-functions-tutorial/network"&gt;&lt;img src="https://camo.githubusercontent.com/c2b3d7806b2534c1e35f4191652b1afbd675648d/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f666f726b732f6861636b657273616e64736c61636b6572732f6e65746c6966792d66756e6374696f6e732d7475746f7269616c2e7376673f7374796c653d666c61742d73717561726526636f6c6f72413d34633536366126636f6c6f72423d656263623862266c6f676f3d476974487562" alt="GitHub forks"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;An example GatsbyJS project demonstrating how to deploy Lambda functions with Netlify. Source code for the accompanying tutorial found here: &lt;a href="https://hackersandslackers.com/deploy-serverless-golang-functions-with-netlify/" rel="nofollow"&gt;https://hackersandslackers.com/deploy-serverless-golang-functions-with-netlify/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a rel="noopener noreferrer" href="https://raw.githubusercontent.com/hackersandslackers/netlify-functions-tutorial/master/./.github/netlify-lambda-go@2x.jpg"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--q6_IDZBw--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://raw.githubusercontent.com/hackersandslackers/netlify-functions-tutorial/master/./.github/netlify-lambda-go%402x.jpg" alt="Netlify Function Tutorial"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Demo here: &lt;a href="https://serverless-golang-tutorial.netlify.app" rel="nofollow"&gt;https://serverless-golang-tutorial.netlify.app&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;

  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/hackersandslackers/netlify-functions-tutorial"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;h2&gt;
  
  
  Why is this So Convoluted?
&lt;/h2&gt;

&lt;p&gt;Compared to the workflow of deploying JavaScript functions on Netlify, Golang is an undocumented and seemingly half-supported nightmare. Until just this week, inspecting your Go Lambda functions on Netlify's UI would append a &lt;strong&gt;.js&lt;/strong&gt; file extension to every function by default, apparently on the assumption that everybody is picking JS over Go:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--reobZHmt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/07/netlify_golang_serverless_dotjs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--reobZHmt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/07/netlify_golang_serverless_dotjs.png" alt="Deploy Serverless Golang Functions with Netlify"&gt;&lt;/a&gt;Netlify doesn't care about Golang™&lt;/p&gt;

&lt;p&gt;This is generally in line with Netlify's actions of half-assing their Go documentation, and even releasing a CLI to assist in function development called &lt;a href="https://www.npmjs.com/package/netlify-lambda"&gt;netlify-lambda&lt;/a&gt;, which &lt;em&gt;only supports JavaScript.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Other than making you feel like a second-class citizen, these gaffes are more amusing than they are harmful. Once your functions are deployed, they work as you'd expect. All is well that ends well.&lt;/p&gt;

&lt;p&gt;If you're looking for more resources on Go Lambda development, &lt;a href="https://github.com/aws/aws-lambda-go"&gt;AWS has a GitHub repo&lt;/a&gt; of examples that might prove useful. Aside from that and this tutorial, I think you have everything you need 😉. Until next time.&lt;/p&gt;

</description>
      <category>jamstack</category>
      <category>go</category>
      <category>gatsby</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Connecting Pandas to a Database with SQLAlchemy</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Fri, 12 Jun 2020 13:53:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/connecting-pandas-to-a-database-with-sqlalchemy-1mnf</link>
      <guid>https://dev.to/hackersandslackers/connecting-pandas-to-a-database-with-sqlalchemy-1mnf</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--RDJ92m8D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/06/pandas-sqlalchemy-1-4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--RDJ92m8D--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/06/pandas-sqlalchemy-1-4.jpg" alt="Connecting Pandas to a Database with SQLAlchemy"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Databases. You love them, you need them, but let's face it... you've already mastered working with them. There's only so much fun to be had in the business of opening database connections, pulling rows, and putting them back where they came from. Wouldn't it be great if we could skip the boring stuff and work with data?&lt;/p&gt;

&lt;p&gt;Pandas and SQLAlchemy are a match made in Python heaven. They're individually amongst Python's most frequently used libraries. Together they're greater than the sum of their parts, thanks to Pandas' built-in SQLAlchemy integration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create a SQLAlchemy Connection
&lt;/h2&gt;

&lt;p&gt;As you might imagine, the first two libraries we need to install are Pandas and SQLAlchemy. We need to install a database connector as our third and final library, but the library you need depends on the type of database you'll be connecting to. If you're connecting to MySQL I recommend installing &lt;strong&gt;PyMySQL&lt;/strong&gt; ( &lt;code&gt;pip install pymysql&lt;/code&gt; ). If you're connecting to Postgres, go with &lt;strong&gt;Psycopg2&lt;/strong&gt; ( &lt;code&gt;pip install psycopg2&lt;/code&gt; ). The only time we'll use either of these libraries is when we establish a database connection with SQLAlchemy.&lt;/p&gt;

&lt;h3&gt;
  
  
  SQLAlchemy URIs
&lt;/h3&gt;

&lt;p&gt;A URI (or connection string) is simply a string containing the information needed to connect to something like a database. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;postgresql+psycopg2://myuser:mypassword@hackersdb.example.com:5432/mydatabase
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Postgres database connection URI





&lt;p&gt;The first part of our string is &lt;code&gt;postgresql+psycopg2&lt;/code&gt;, which is a combination of our target database type and our connector (recent SQLAlchemy releases only accept the &lt;code&gt;postgresql&lt;/code&gt; spelling, not the older &lt;code&gt;postgres&lt;/code&gt;). If you're connecting to MySQL, replace this with &lt;code&gt;mysql+pymysql&lt;/code&gt;. In case the rest of the URI isn't self-explanatory, here's a breakdown of each piece of this string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;DB_FLAVOR]+[DB_PYTHON_LIBRARY]://[USERNAME]:[PASSWORD]@[DB_HOST]:[PORT]/[DB_NAME]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
SQLAlchemy URI structure
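&lt;p&gt;To make the structure above concrete, here's a minimal sketch that assembles a URI from its parts using only the standard library. The helper name &lt;code&gt;build_db_uri&lt;/code&gt; and the sample password are made up for illustration. The one detail worth noting is that credentials get URL-encoded, since characters like &lt;code&gt;@&lt;/code&gt; or &lt;code&gt;/&lt;/code&gt; in a password would otherwise be mistaken for URI delimiters:&lt;/p&gt;

```python
from urllib.parse import quote_plus, urlsplit


def build_db_uri(flavor, driver, username, password, host, port, db_name):
    """Assemble a SQLAlchemy-style URI (hypothetical helper, not part of SQLAlchemy)."""
    # URL-encode the credentials so special characters can't break the URI.
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"{flavor}+{driver}://{user}:{secret}@{host}:{port}/{db_name}"


# Example values only; the password is invented to show the encoding.
uri = build_db_uri(
    "postgresql",
    "psycopg2",
    "myuser",
    "p@ss/word",
    "hackersdb.example.com",
    5432,
    "mydatabase",
)
print(uri)

# Splitting the URI back apart confirms each piece landed where we expect.
parts = urlsplit(uri)
print(parts.scheme, parts.hostname, parts.port, parts.path)
```

&lt;p&gt;In a real project you'd likely keep the finished URI in an environment variable rather than building it inline.&lt;/p&gt;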





&lt;h3&gt;
  
  
  SQLAlchemy Engines
&lt;/h3&gt;

&lt;p&gt;An "engine" is an object used to connect to databases using the information in our URI. Once we create an engine, downloading and uploading data is as simple as passing this object to Pandas:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from os import environ
from sqlalchemy import create_engine

db_uri = environ.get('SQLALCHEMY_DATABASE_URI')
engine = create_engine(db_uri, echo=True)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Configure SQLAlchemy engine.





&lt;p&gt;Besides accepting a URI, &lt;code&gt;create_engine()&lt;/code&gt; can accept a few optional kwargs as well. I've decided to set &lt;code&gt;echo=True&lt;/code&gt;, which will log every query our SQL database executes to the terminal. If your database requires SSL, you may need to utilize the &lt;code&gt;connect_args&lt;/code&gt; parameter to pass a certificate.&lt;/p&gt;

&lt;p&gt;Believe it or not, we're already done dealing with database setup! From here forward we're able to pull or upload data into Pandas via easy one-liners.&lt;/p&gt;
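&lt;p&gt;If you'd like to confirm an engine actually works before handing it to Pandas, a quick sanity check might look like the sketch below. It assumes an in-memory SQLite database (&lt;code&gt;sqlite://&lt;/code&gt;) purely so the snippet runs without a database server; swap in your own URI to test a real connection:&lt;/p&gt;

```python
from sqlalchemy import create_engine, text

# Assumption: an in-memory SQLite URI stands in for the real database URI.
engine = create_engine("sqlite://", echo=False)

# Open a connection, run a trivial query, then release the connection.
with engine.connect() as connection:
    result = connection.execute(text("SELECT 1")).scalar()

print(result)
```

&lt;p&gt;If the URI or credentials are wrong, the failure surfaces here instead of in the middle of a data upload.&lt;/p&gt;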

&lt;h2&gt;
  
  
  Create a SQL Table From a DataFrame
&lt;/h2&gt;

&lt;p&gt;For our first trick, let's create a SQL table from data in a CSV. I downloaded a CSV containing &lt;a href="https://www.kaggle.com/new-york-city/new-york-city-current-job-postings"&gt;NYC job data&lt;/a&gt; which I'll be using to demonstrate:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7kJKFrvM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/3032/1%2AwwcTa6orjchtI58kk_bFEg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7kJKFrvM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://miro.medium.com/max/3032/1%2AwwcTa6orjchtI58kk_bFEg.png" alt="NYC Jobs data"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We're going to create a DataFrame from this CSV, as we've done a million times before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;


&lt;span class="n"&gt;jobs_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'data/nyc-jobs.csv'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Read and format data from CSV.
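&lt;p&gt;The column names used in the rest of this tutorial are lowercase with underscores, while raw CSV headers are often title-cased with spaces. One possible way to tidy them up is sketched below. The sample rows are invented stand-ins for &lt;code&gt;data/nyc-jobs.csv&lt;/code&gt;, and the renaming rule is an assumption rather than a required step:&lt;/p&gt;

```python
from io import StringIO

import pandas as pd

# Hypothetical two-row sample standing in for data/nyc-jobs.csv.
sample_csv = StringIO(
    "Job ID,Agency,Division/Work Unit\n"
    "1,DOT,Planning\n"
    "2,DEP,Engineering\n"
)
jobs_df = pd.read_csv(sample_csv)

# Lowercase each header and swap spaces for underscores.
jobs_df.columns = [
    column.strip().lower().replace(" ", "_") for column in jobs_df.columns
]

print(list(jobs_df.columns))
```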





&lt;p&gt;We now have a DataFrame ready to be saved as a SQL table! We can accomplish this with a single method built in to all DataFrames called &lt;code&gt;to_sql()&lt;/code&gt;. As the name suggests, &lt;code&gt;to_sql()&lt;/code&gt; allows us to upload our DataFrame to a SQL database as a SQL table. Let's see it in action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;sqlalchemy.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DateTime&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;table_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'nyc_jobs'&lt;/span&gt;

&lt;span class="n"&gt;jobs_df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;if_exists&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'replace'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;chunksize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;dtype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"job_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"agency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"business_title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"job_category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"salary_range_from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"salary_range_to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"salary_frequency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;"work_location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"division/work_unit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"job_description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"posting_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DateTime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"posting_updated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;DateTime&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Save DataFrame to SQL table.





&lt;p&gt;There's quite a bit happening here! &lt;code&gt;to_sql()&lt;/code&gt; attempts to create a table with the name &lt;strong&gt;nyc_jobs&lt;/strong&gt; in the database associated with &lt;strong&gt;engine&lt;/strong&gt;. These two positional arguments are technically the only required parameters we need to pass, but it's a very good idea to take advantage of Pandas' ability to be more specific in table creation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;if_exists&lt;/code&gt;: This argument specifies what to do in the situation where a database table with the name &lt;strong&gt;nyc_jobs&lt;/strong&gt; already exists in the database. By default, Pandas will throw an error, which isn't very useful unless we only care about creating this table the first time. Passing &lt;strong&gt;replace&lt;/strong&gt; to this argument will drop the existing table and replace it with the data &amp;amp; data types associated with the current DataFrame. &lt;strong&gt;append&lt;/strong&gt; will keep the existing table the same, but append all rows in the DataFrame to the existing table.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;schema&lt;/code&gt;: Accepts the name of the Postgres schema to save your table in.&lt;/li&gt;
&lt;li&gt;
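&lt;code&gt;index&lt;/code&gt;: When &lt;strong&gt;True&lt;/strong&gt;, the DataFrame's index is written to the table as its own column (optionally named via the &lt;code&gt;index_label&lt;/code&gt; argument). Pass &lt;strong&gt;False&lt;/strong&gt; to leave the index out, as we do here.&lt;/li&gt;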
&lt;li&gt;
&lt;code&gt;chunksize&lt;/code&gt;: Passing a number to this parameter will attempt to upload your data as a stream of "chunks" &lt;em&gt;n&lt;/em&gt; rows at a time, as opposed to all at once. Passing a chunksize is useful for particularly large datasets which may be at risk of interruption during upload.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;dtype&lt;/code&gt;: Passing a Python dictionary to &lt;strong&gt;dtype&lt;/strong&gt; lets us explicitly set the datatypes of each column in our database, where each &lt;strong&gt;key&lt;/strong&gt; is the &lt;em&gt;column name&lt;/em&gt; and each &lt;strong&gt;value&lt;/strong&gt; is the &lt;em&gt;data type&lt;/em&gt; (I &lt;em&gt;highly&lt;/em&gt; recommend doing this). You'll notice we import various data types from &lt;strong&gt;sqlalchemy.types&lt;/strong&gt;, which we then associate with each column's name. If the target SQL table doesn't exist yet, passing these datatypes will ensure that each SQL column is created with the appropriate data constraint, as opposed to each column rendered simply as "text." If a target SQL table &lt;em&gt;does&lt;/em&gt; exist, these data types &lt;em&gt;must&lt;/em&gt; match the types of the existing table, or you'll receive a SQL error during the upload.&lt;/li&gt;
&lt;/ul&gt;
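&lt;p&gt;To see how the &lt;code&gt;if_exists&lt;/code&gt; options behave, here's a small sketch you can run without a database server. It assumes an in-memory SQLite engine and a made-up two-row table named &lt;code&gt;demo_jobs&lt;/code&gt;; neither is part of the NYC jobs example:&lt;/p&gt;

```python
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import Integer, Text

# Assumption: in-memory SQLite stands in for a real MySQL/Postgres database.
engine = create_engine("sqlite://")

demo_df = pd.DataFrame({"job_id": [1, 2], "agency": ["DOT", "DEP"]})
column_types = {"job_id": Integer, "agency": Text}

# 'replace' drops any existing table and recreates it from the DataFrame.
demo_df.to_sql(
    "demo_jobs", engine, if_exists="replace", index=False, dtype=column_types
)

# 'append' keeps the existing table and adds the rows to it.
demo_df.to_sql(
    "demo_jobs", engine, if_exists="append", index=False, dtype=column_types
)
row_count = len(pd.read_sql_table("demo_jobs", con=engine))
print(row_count)

# The default ('fail') raises a ValueError because the table already exists.
try:
    demo_df.to_sql("demo_jobs", engine, index=False)
    raised_error = False
except ValueError:
    raised_error = True
print(raised_error)
```

&lt;p&gt;Running an upload twice with &lt;strong&gt;append&lt;/strong&gt; duplicates rows, so &lt;strong&gt;replace&lt;/strong&gt; tends to be the safer choice for scripts that reload a full dataset each time.&lt;/p&gt;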

&lt;p&gt;Since we set SQLAlchemy's &lt;code&gt;echo&lt;/code&gt; parameter to &lt;code&gt;True&lt;/code&gt;, I'm able to see exactly what my database does with this DataFrame:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;2020-06-11 23:49:21,082 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE &lt;span class="s1"&gt;'sql_mode'&lt;/span&gt;
2020-06-11 23:49:21,082 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,396 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE &lt;span class="s1"&gt;'lower_case_table_names'&lt;/span&gt;
2020-06-11 23:49:21,396 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,432 INFO sqlalchemy.engine.base.Engine SELECT DATABASE&lt;span class="o"&gt;()&lt;/span&gt;
2020-06-11 23:49:21,432 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,470 INFO sqlalchemy.engine.base.Engine show collation where &lt;span class="s2"&gt;"Charset"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'utf8mb4'&lt;/span&gt; and &lt;span class="s2"&gt;"Collation"&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'utf8mb4_bin'&lt;/span&gt;
2020-06-11 23:49:21,470 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,502 INFO sqlalchemy.engine.base.Engine SELECT CAST&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'test plain returns'&lt;/span&gt; AS CHAR&lt;span class="o"&gt;(&lt;/span&gt;60&lt;span class="o"&gt;))&lt;/span&gt; AS anon_1
2020-06-11 23:49:21,502 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,523 INFO sqlalchemy.engine.base.Engine SELECT CAST&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'test unicode returns'&lt;/span&gt; AS CHAR&lt;span class="o"&gt;(&lt;/span&gt;60&lt;span class="o"&gt;))&lt;/span&gt; AS anon_1
2020-06-11 23:49:21,523 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,537 INFO sqlalchemy.engine.base.Engine SELECT CAST&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'test collated returns'&lt;/span&gt; AS CHAR CHARACTER SET utf8mb4&lt;span class="o"&gt;)&lt;/span&gt; COLLATE utf8mb4_bin AS anon_1
2020-06-11 23:49:21,537 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,587 INFO sqlalchemy.engine.base.Engine DESCRIBE &lt;span class="s2"&gt;"nyc_jobs"&lt;/span&gt;
2020-06-11 23:49:21,588 INFO sqlalchemy.engine.base.Engine &lt;span class="o"&gt;{}&lt;/span&gt;
2020-06-11 23:49:21,654 INFO sqlalchemy.engine.base.Engine ROLLBACK
2020-06-11 23:49:21,691 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE nyc_jobs &lt;span class="o"&gt;(&lt;/span&gt;
    job_id INTEGER, 
    agency TEXT, 
    business_title TEXT, 
    job_category TEXT, 
    salary_range_from INTEGER, 
    salary_range_to INTEGER, 
    salary_frequency VARCHAR&lt;span class="o"&gt;(&lt;/span&gt;50&lt;span class="o"&gt;)&lt;/span&gt;, 
    work_location TEXT, 
    division TEXT, 
    job_description TEXT, 
    created_at DATETIME, 
    updated_at DATETIME
&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
SQLAlchemy creating a table from a Pandas DataFrame.





&lt;p&gt;Just as we described, our database uses &lt;code&gt;CREATE TABLE nyc_jobs&lt;/code&gt; to create a new SQL table, with all columns assigned appropriate data types.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create DataFrame from SQL Table
&lt;/h2&gt;

&lt;p&gt;Loading data from a database into a Pandas DataFrame is surprisingly easy. To load an entire table, use the &lt;code&gt;read_sql_table()&lt;/code&gt; method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;table_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_sql_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;con&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Convert SQL table to Pandas DataFrame





&lt;p&gt;The first two parameters we pass are the same as last time: first is our table name, and then our SQLAlchemy engine. The above snippet is perhaps the quickest and simplest way to translate a SQL table into a Pandas DataFrame, with essentially no configuration needed! Interestingly, Pandas is still oblivious to the dtype of each column we've pulled &lt;em&gt;despite having pulled from a database&lt;/em&gt;, as we can see with &lt;code&gt;print(table_df.info())&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&amp;lt;class &lt;span class="s1"&gt;'pandas.core.frame.DataFrame'&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
RangeIndex: 3123 entries, 0 to 3122
Data columns &lt;span class="o"&gt;(&lt;/span&gt;total 12 columns&lt;span class="o"&gt;)&lt;/span&gt;:
 &lt;span class="c"&gt;# Column Non-Null Count Dtype &lt;/span&gt;
&lt;span class="nt"&gt;---&lt;/span&gt; &lt;span class="nt"&gt;------&lt;/span&gt; &lt;span class="nt"&gt;--------------&lt;/span&gt; &lt;span class="nt"&gt;-----&lt;/span&gt; 
 0 job_id 3123 non-null int64 
 1 agency 3123 non-null object
 2 business_title 3123 non-null object
 3 job_category 3121 non-null object
 4 salary_range_from 3123 non-null int64 
 5 salary_range_to 3123 non-null int64 
 6 salary_frequency 3123 non-null object
 7 work_location 3123 non-null object
 8 division 3123 non-null object
 9 job_description 3123 non-null object
 10 created_at 3123 non-null object
 11 updated_at 3123 non-null object
dtypes: int64&lt;span class="o"&gt;(&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;, object&lt;span class="o"&gt;(&lt;/span&gt;9&lt;span class="o"&gt;)&lt;/span&gt;
memory usage: 292.9+ KB
None
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Output of table_df.info().





&lt;p&gt;The &lt;code&gt;read_sql_table()&lt;/code&gt; method can accept far more arguments than the two we passed. Here's an example where we read a SQL table and force some explicit things to happen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;table_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_sql_table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"nyc_jobs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;con&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'public'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;index_col&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'job_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;coerce_float&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s"&gt;'job_id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'business_title'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'job_category'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'posting_date'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'posting_updated'&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;parse_dates&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'updated_at'&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;chunksize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Loading a SQL table with explicit values.





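&lt;p&gt;Some arguments should look familiar from when we ran &lt;code&gt;to_sql()&lt;/code&gt; earlier. &lt;strong&gt;schema&lt;/strong&gt; has the same meaning as before. &lt;strong&gt;chunksize&lt;/strong&gt; still controls how many rows are handled at a time, with one difference: when it's set, &lt;code&gt;read_sql_table()&lt;/code&gt; returns an iterator of DataFrames rather than a single DataFrame. We also have a few new arguments:&lt;/p&gt;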

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;index_col&lt;/code&gt;: We can select any column of our SQL table to become an index in our Pandas DataFrame, regardless of whether or not the column is an index in SQL. We can pass the name of a single column as a string, or a list of strings representing the names of multiple columns.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;coerce_float&lt;/code&gt;: When set to &lt;strong&gt;True&lt;/strong&gt;, Pandas will attempt to convert non-string, non-numeric values (such as &lt;code&gt;decimal.Decimal&lt;/code&gt; objects) to floating point numbers, which can cost some precision. This argument is set to &lt;strong&gt;True&lt;/strong&gt; by default.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;columns&lt;/code&gt;: Passing a list of column names to this argument will create a DataFrame from &lt;em&gt;only&lt;/em&gt; the columns we provide (similar to a SQL select on x columns).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;parse_dates&lt;/code&gt;: When moving data into Pandas we need to explicitly state which columns should be considered DateTime columns.&lt;/li&gt;
&lt;/ul&gt;
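&lt;p&gt;Here's a rough sketch of these arguments working together, again assuming an in-memory SQLite engine and an invented three-row &lt;code&gt;demo_jobs&lt;/code&gt; table so it can run anywhere. Because &lt;code&gt;chunksize&lt;/code&gt; is set, the call hands back an iterator of DataFrames, which we stitch together with &lt;code&gt;pd.concat()&lt;/code&gt;:&lt;/p&gt;

```python
import pandas as pd
from sqlalchemy import create_engine

# Assumption: in-memory SQLite with a made-up table, for illustration only.
engine = create_engine("sqlite://")
pd.DataFrame(
    {
        "job_id": [1, 2, 3],
        "business_title": ["Analyst", "Engineer", "Planner"],
        "created_at": ["2020-06-01", "2020-06-02", "2020-06-03"],
    }
).to_sql("demo_jobs", engine, index=False)

# With chunksize set, read_sql_table() yields DataFrames of up to 2 rows each.
chunks = pd.read_sql_table(
    "demo_jobs",
    con=engine,
    index_col="job_id",
    columns=["business_title", "created_at"],
    parse_dates=["created_at"],
    chunksize=2,
)

chunk_sizes = []
frames = []
for chunk in chunks:
    chunk_sizes.append(len(chunk))
    frames.append(chunk)

# Combine the chunks back into a single DataFrame.
table_df = pd.concat(frames)
print(chunk_sizes)
print(table_df.dtypes)
```

&lt;p&gt;Processing each chunk inside the loop, instead of concatenating, keeps memory usage flat for tables too large to load at once.&lt;/p&gt;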

&lt;h3&gt;
  
  
  Create DataFrames From Query Results
&lt;/h3&gt;

&lt;p&gt;There will probably be times where you're just looking for a subset of data in a table as opposed to the entire table. In this scenario we can use &lt;code&gt;read_sql()&lt;/code&gt;, which creates a DataFrame from the results of a SQL query you run on a table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;sql_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"SELECT * FROM nyc_jobs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;con&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;parse_dates&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s"&gt;'created_at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'updated_at'&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
SQL query to Pandas DataFrame.





&lt;p&gt;This time around our first parameter is a SQL query instead of the name of a table. We can modify this query to select only specific columns, rows which match criteria, or anything else you can do with SQL.&lt;/p&gt;
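&lt;p&gt;When a query depends on outside input, it's generally safer to bind values as parameters than to paste them into the SQL string. The sketch below shows one way to do that with SQLAlchemy's &lt;code&gt;text()&lt;/code&gt; construct and the &lt;code&gt;params&lt;/code&gt; argument of &lt;code&gt;read_sql()&lt;/code&gt;. The in-memory SQLite engine, the &lt;code&gt;demo_jobs&lt;/code&gt; table, and its rows are all assumptions made for the example:&lt;/p&gt;

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Assumption: in-memory SQLite with a made-up table, for illustration only.
engine = create_engine("sqlite://")
pd.DataFrame(
    {
        "job_id": [1, 2, 3],
        "agency": ["DOT", "DEP", "DOT"],
        "salary_range_from": [50000, 62000, 71000],
    }
).to_sql("demo_jobs", engine, index=False)

# :agency is a named placeholder; its value is supplied separately via params.
query = text(
    "SELECT job_id, salary_range_from "
    "FROM demo_jobs "
    "WHERE agency = :agency "
    "ORDER BY job_id"
)
sql_df = pd.read_sql(query, con=engine, params={"agency": "DOT"})
print(sql_df)
```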

&lt;p&gt;That's all folks! If you're interested, the source is up on GitHub here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/hackersandslackers/pandas-sqlalchemy-tutorial"&gt;https://github.com/hackersandslackers/pandas-sqlalchemy-tutorial&lt;/a&gt;&lt;/p&gt;

</description>
      <category>pandas</category>
      <category>python</category>
      <category>dataanalysis</category>
      <category>sql</category>
    </item>
    <item>
      <title>Scraping Data on the Web with BeautifulSoup</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Thu, 11 Jun 2020 22:40:09 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/scraping-data-on-the-web-with-beautifulsoup-11fl</link>
      <guid>https://dev.to/hackersandslackers/scraping-data-on-the-web-with-beautifulsoup-11fl</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--nrs-MLEB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/06/beautifulsoup-1-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--nrs-MLEB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/06/beautifulsoup-1-1.jpg" alt="Scraping Data on the Web with BeautifulSoup"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whether it be &lt;a href="https://www.kaggle.com/"&gt;Kaggle&lt;/a&gt;, &lt;a href="https://console.cloud.google.com/bigquery"&gt;Google Cloud&lt;/a&gt;, or the &lt;a href="https://www.data.gov/"&gt;federal government&lt;/a&gt;, there's plenty of reliable open-sourced data on the web. While there are plenty of reasons to hate being alive in our current chapter of humanity, open data is one of the few redeeming qualities of life on Earth today. But what is the opposite of "open" data, anyway?&lt;/p&gt;

&lt;p&gt;Like anything free and easily accessible, the only data inherently worth anything is either harvested privately or stolen from sources that would prefer you didn't. This is the sort of data business models can be built around, as social media platforms such as &lt;a href="https://techcrunch.com/2018/11/24/linkedin-ireland-data-protection/"&gt;LinkedIn&lt;/a&gt; have shown us as our personal information is &lt;a href="https://securityboulevard.com/2018/06/data-brokers-you-are-being-packaged-and-sold/"&gt;bought and sold by data brokers&lt;/a&gt;. These companies &lt;a href="https://techcrunch.com/2016/08/15/linkedin-sues-scrapers/"&gt;attempted to sue individual programmers&lt;/a&gt; like ourselves for scraping the data they collected via the same means, and epically lost in a court of law:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.forbes.com/sites/emmawoollacott/2019/09/10/linkedin-data-scraping-ruled-legal/"&gt;https://www.forbes.com/sites/emmawoollacott/2019/09/10/linkedin-data-scraping-ruled-legal/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The topic of scraping data on the web tends to raise questions about the ethics and legality of scraping, to which I plead: &lt;em&gt;don't hold back&lt;/em&gt;. If you aren't personally disgusted by the prospect of your life being transcribed, sold, and frequently leaked, the court system has ruled that you legally have a right to scrape data. The name of this publication is not &lt;strong&gt;People Who Play It Safe And Slackers&lt;/strong&gt;. We're a home for those who fight to take power back, and we're going to scrape the shit out of you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tools for the Job
&lt;/h2&gt;

&lt;p&gt;Web scraping in Python is dominated by three major libraries: &lt;a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/"&gt;&lt;strong&gt;BeautifulSoup&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://scrapy.org/"&gt;&lt;strong&gt;Scrapy&lt;/strong&gt;&lt;/a&gt;, and &lt;a href="https://selenium-python.readthedocs.io/"&gt;&lt;strong&gt;Selenium&lt;/strong&gt;&lt;/a&gt;. Each of these libraries intends to solve for very different use cases. Thus it's essential to understand what we're choosing and why.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BeautifulSoup&lt;/strong&gt; is one of the most prolific Python libraries in existence, in some part having shaped the web as we know it. BeautifulSoup is a lightweight, easy-to-learn, and highly effective way to programmatically isolate information on a single webpage at a time. It's common to use BeautifulSoup in conjunction with the &lt;strong&gt;requests&lt;/strong&gt; library, where &lt;em&gt;requests&lt;/em&gt; will fetch a page, and &lt;em&gt;BeautifulSoup&lt;/em&gt; will extract the resulting data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scrapy&lt;/strong&gt; has an agenda much closer to mass pillaging than BeautifulSoup. Scrapy is a tool for building crawlers: these are absolute monstrosities unleashed upon the web like a swarm, loosely following links, and hastily grabbing data where data exists to be grabbed. Because Scrapy serves the purpose of mass-scraping, it is much easier to get in trouble with Scrapy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selenium&lt;/strong&gt; isn't exclusively a scraping tool as much as an automation tool that can be used to scrape sites. Selenium is the nuclear option for attempting to navigate sites programmatically, and should be treated as such: there are much better options for simple data extraction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'll be using BeautifulSoup, which should genuinely be anybody's default choice until the circumstances ask for more. BeautifulSoup is more than enough to steal data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Preparing Our Extraction
&lt;/h2&gt;

&lt;p&gt;Before we steal any data, we need to set the stage. We'll start by installing our two libraries of choice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;beautifulsoup4 requests
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Install &lt;strong&gt;beautifulsoup&lt;/strong&gt; and &lt;strong&gt;requests&lt;/strong&gt;.





&lt;p&gt;As mentioned before, &lt;strong&gt;requests&lt;/strong&gt; will provide us with our target's HTML, and &lt;strong&gt;beautifulsoup4&lt;/strong&gt; will parse that data.&lt;/p&gt;

&lt;p&gt;We need to recognize that a lot of sites have precautions to fend off scrapers from accessing their data. The first thing we can do to get around this is spoofing the headers we send along with our requests to make our scraper look like a legitimate browser:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;'Access-Control-Allow-Methods'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;'Access-Control-Allow-Headers'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;'Access-Control-Max-Age'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3600'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;'User-Agent'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Set request headers.





&lt;p&gt;This is only a first line of defense (or offense, in our case). There are plenty of ways sites can still keep us at bay, but setting headers works shockingly well to fix most issues.&lt;/p&gt;

&lt;p&gt;Now let's fetch a page and inspect it with BeautifulSoup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://example.com"&lt;/span&gt;
&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'html.parser'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;prettify&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Scrape example.com





&lt;p&gt;We set things up by making a request to &lt;a href="http://example.com"&gt;http://example.com&lt;/a&gt;. We then create a BeautifulSoup object which accepts the raw content of that response via &lt;code&gt;req.content&lt;/code&gt;. The second parameter, &lt;code&gt;'html.parser'&lt;/code&gt;, tells BeautifulSoup to parse the document with Python's built-in HTML parser. There are other parsers available for parsing stuff like XML, if you're into that.&lt;/p&gt;
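
&lt;p&gt;As a rough sketch of the same fetch step, the headers can also be attached to a &lt;code&gt;requests.Session&lt;/code&gt; so that every request carries them. The helper names &lt;code&gt;build_session&lt;/code&gt; and &lt;code&gt;fetch_soup&lt;/code&gt;, the &lt;code&gt;BROWSER_HEADERS&lt;/code&gt; constant, and the 10-second timeout are assumptions made for illustration:&lt;/p&gt;

```python
import requests
from bs4 import BeautifulSoup

# Assumed browser-like User-Agent; any realistic value should work here.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
}


def build_session():
    """Create a session that sends our spoofed headers on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    return session


def fetch_soup(url, session=None, timeout=10):
    """Fetch a URL and parse the response body as HTML."""
    session = session or build_session()
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return BeautifulSoup(response.content, "html.parser")
```

&lt;p&gt;Calling &lt;code&gt;fetch_soup("http://example.com")&lt;/code&gt; should hand back the same kind of object as the snippet above, with the added benefit that a blocked or failed request raises an error instead of quietly parsing an error page.&lt;/p&gt;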

&lt;p&gt;When we create a BeautifulSoup object from a page's HTML, our object contains the HTML structure of that page, which can now be easily parsed by all sorts of methods. First, let's see what our variable &lt;code&gt;soup&lt;/code&gt; looks like by using &lt;code&gt;print(soup.prettify())&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;html&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"gr__example_com"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;head&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;Example Domain&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;charset=&lt;/span&gt;&lt;span class="s"&gt;"utf-8"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;http-equiv=&lt;/span&gt;&lt;span class="s"&gt;"Content-type"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"text/html; charset=utf-8"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"viewport"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"width=device-width, initial-scale=1"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;property=&lt;/span&gt;&lt;span class="s"&gt;"og:site_name"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"Example dot com"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;property=&lt;/span&gt;&lt;span class="s"&gt;"og:type"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"website"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;property=&lt;/span&gt;&lt;span class="s"&gt;"og:title"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"Example"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;property=&lt;/span&gt;&lt;span class="s"&gt;"og:description"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"An Example website."&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;property=&lt;/span&gt;&lt;span class="s"&gt;"og:image"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"http://example.com/img/image.jpg"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"twitter:title"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"Hackers and Slackers"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"twitter:description"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"An Example website."&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"twitter:url"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"http://example.com/"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"twitter:image"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"http://example.com/img/image.jpg"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;body&lt;/span&gt; &lt;span class="na"&gt;data-gr-c-s-loaded=&lt;/span&gt;&lt;span class="s"&gt;"true"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;div&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;Example Domain&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;This domain is established to be used for illustrative examples in documents.&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;You may use this domain in examples without prior coordination or asking for permission.&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"http://www.iana.org/domains/example"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;More information...&lt;span class="nt"&gt;&amp;lt;/a&amp;gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
HTML for example.com





&lt;h2&gt;
  
  
  Targeting HTML Elements
&lt;/h2&gt;

&lt;p&gt;There are many methods available to us for pinpointing and grabbing the information we're trying to get out of a page. Finding the &lt;em&gt;exact&lt;/em&gt; information we want out of a web page is a bit of an art form: effective scraping requires us to recognize patterns in a document's HTML that we can take advantage of to ensure we only grab the pieces we need. This is especially the case when dealing with sites that actively try to prevent us from doing just that.&lt;/p&gt;

&lt;p&gt;Understanding the tools we have at our disposal is the first step to developing a keen eye for what's possible. We'll start with the meat and potatoes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using find() &amp;amp; find_all()
&lt;/h3&gt;

&lt;p&gt;The most straightforward way to find information in our &lt;code&gt;soup&lt;/code&gt; variable is by utilizing &lt;code&gt;soup.find(...)&lt;/code&gt; or &lt;code&gt;soup.find_all(...)&lt;/code&gt;. These two methods work the same with one exception: &lt;strong&gt;find&lt;/strong&gt; returns the first HTML element found, whereas &lt;strong&gt;find_all&lt;/strong&gt; returns a list of all elements matching the criteria (even if only one element is found, &lt;strong&gt;find_all&lt;/strong&gt; will return a list of a single item).&lt;/p&gt;

&lt;p&gt;We can search for DOM elements in our &lt;code&gt;soup&lt;/code&gt; variable by searching for certain criteria. Passing a tag name as a positional argument to &lt;strong&gt;find_all&lt;/strong&gt; returns every tag of that type, so passing &lt;code&gt;"a"&lt;/code&gt; will return &lt;em&gt;all&lt;/em&gt; anchor tags on the page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# &amp;lt;a href="http://example.com/elsie" class="boy" id="link1"&amp;gt;Elsie&amp;lt;/a&amp;gt;
# &amp;lt;a href="http://example.com/lacie" class="boy" id="link2"&amp;gt;Lacie&amp;lt;/a&amp;gt; 
# &amp;lt;a href="http://example.com/tillie" class="girl" id="link3"&amp;gt;Tillie&amp;lt;/a&amp;gt;
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Find all &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; tags.





&lt;p&gt;We can also find all anchor tags which have the class name &lt;em&gt;"boy"&lt;/em&gt;. Passing the &lt;code&gt;class_&lt;/code&gt; argument allows us to filter by class name. Note the underscore!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"boy"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# &amp;lt;a href="http://example.com/elsie" class="boy" id="link1"&amp;gt;Elsie&amp;lt;/a&amp;gt;
# &amp;lt;a href="http://example.com/lacie" class="boy" id="link2"&amp;gt;Lacie&amp;lt;/a&amp;gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Find all &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; tags assigned a certain class.





&lt;p&gt;If we wanted to get &lt;em&gt;any&lt;/em&gt; element with the class name &lt;em&gt;"boy"&lt;/em&gt;, not just anchor tags, we can do that too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;class_&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"boy"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# &amp;lt;a href="http://example.com/elsie" class="boy" id="link1"&amp;gt;Elsie&amp;lt;/a&amp;gt;
# &amp;lt;a href="http://example.com/lacie" class="boy" id="link2"&amp;gt;Lacie&amp;lt;/a&amp;gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Find all elements assigned a certain class.





&lt;p&gt;We can search for elements by id in the same way we searched for classes. Remember that we should only expect a single element to be returned with an id, so we should use &lt;strong&gt;find&lt;/strong&gt; here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"link1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# &amp;lt;a href="http://example.com/elsie" class="boy" id="link1"&amp;gt;Elsie&amp;lt;/a&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Find element by ID.





&lt;p&gt;Oftentimes we'll run into situations where elements don't have reliable class or id values. Luckily we can search for DOM elements with any attribute, including non-standard ones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"data-args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"bologna"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Find elements by attribute.
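
&lt;p&gt;To make the difference between the two methods concrete, here is a minimal sketch that runs the same query through both. The helper name &lt;code&gt;first_and_all&lt;/code&gt; is hypothetical, and it assumes we already hold the page's markup as a string:&lt;/p&gt;

```python
from bs4 import BeautifulSoup


def first_and_all(markup, tag_name, class_name):
    """Return what find() and find_all() each give back for the same query."""
    soup = BeautifulSoup(markup, "html.parser")
    first = soup.find(tag_name, class_=class_name)      # a Tag, or None if nothing matches
    every = soup.find_all(tag_name, class_=class_name)  # a list-like ResultSet, possibly empty
    return first, every
```

&lt;p&gt;When nothing matches, &lt;code&gt;find&lt;/code&gt; hands back &lt;code&gt;None&lt;/code&gt; while &lt;code&gt;find_all&lt;/code&gt; hands back an empty list, so code that chains further calls onto a &lt;code&gt;find&lt;/code&gt; result should check for &lt;code&gt;None&lt;/code&gt; first.&lt;/p&gt;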





&lt;h3&gt;
  
  
  CSS Selectors
&lt;/h3&gt;

&lt;p&gt;Searching HTML using CSS selectors is one of the most powerful ways to find what you're looking for, especially for sites trying to make your life difficult. Using CSS selectors enables us to find and leverage highly-specific patterns in the target's DOM structure. This is the best way to ensure we're grabbing &lt;em&gt;exactly&lt;/em&gt; the content we need. If you're rusty on CSS selectors, I highly recommend becoming reacquainted. Here are a few examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".widget.author p"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Find elements via CSS selector syntax.





&lt;p&gt;In this example, we're looking for an element that has a "widget" class, as well as an "author" class. Once we have that element, we go deeper to find any paragraph tags held within that widget. We could also modify this to get only the &lt;em&gt;second&lt;/em&gt; paragraph tag inside the author widget:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;".widget.author p:nth-of-type(2)"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To understand why this is so powerful, imagine a site that intentionally has no identifying attributes on its tags to keep people like you from scraping their data. Even without names to select by, we could observe the DOM structure of the page and find a unique way to navigate to the element we want:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"body &amp;gt; div:first-of-type &amp;gt; div &amp;gt; ul li"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;A specific pattern like this is likely unique to only a single collection of &lt;code&gt;&amp;lt;li&amp;gt;&lt;/code&gt; tags on the page we're exploiting. The downside of this method is we're at the whim of the site owner, as their HTML structure could change.&lt;/p&gt;
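
&lt;p&gt;Because a selector can silently stop matching when a site's structure changes, it may help to wrap selector lookups in small helpers that tolerate an empty result. The sketch below is one way to do that; &lt;code&gt;select_texts&lt;/code&gt; and &lt;code&gt;select_first_text&lt;/code&gt; are hypothetical names, and both assume the markup is already available as a string:&lt;/p&gt;

```python
from bs4 import BeautifulSoup


def select_texts(markup, selector):
    """Return the stripped text of every element matching a CSS selector."""
    soup = BeautifulSoup(markup, "html.parser")
    return [element.get_text(strip=True) for element in soup.select(selector)]


def select_first_text(markup, selector):
    """Return the text of the first match, or None when nothing matches."""
    soup = BeautifulSoup(markup, "html.parser")
    element = soup.select_one(selector)  # returns None instead of raising
    return element.get_text(strip=True) if element is not None else None
```

&lt;p&gt;Passing &lt;code&gt;".widget.author p:nth-of-type(2)"&lt;/code&gt; to &lt;code&gt;select_first_text&lt;/code&gt; would return the second paragraph's text if the author widget exists, and &lt;code&gt;None&lt;/code&gt; if the site has since rearranged its markup.&lt;/p&gt;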

&lt;h2&gt;
  
  
  Get Some Attributes
&lt;/h2&gt;

&lt;p&gt;Chances are we'll almost always want the contents or the attributes of a tag, as opposed to the entirety of a tag's HTML. If we're scraping anchor tags, for instance, we probably just want the &lt;code&gt;href&lt;/code&gt; value, as opposed to the entire tag. The &lt;code&gt;.get&lt;/code&gt; method can be used here to retrieve values of attributes on a tag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'href'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'a'&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Get elements and extract attribute values.





&lt;p&gt;The above finds the destination URLs for all &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; tags on a page. Another example can have us grab a site's logo image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;soup.find(id="logo").get('src') 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Sometimes it's not attributes we're looking for, but just the text within a tag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'p'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Get elements and extract text content.
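
&lt;p&gt;Attribute and text extraction often go together. The following is a small sketch that pairs each link's visible text with its destination; the function name &lt;code&gt;extract_links&lt;/code&gt; is an assumption, as is the choice to skip anchors that have no &lt;code&gt;href&lt;/code&gt; at all:&lt;/p&gt;

```python
from bs4 import BeautifulSoup


def extract_links(markup):
    """Return (text, href) pairs for every anchor tag that has an href."""
    soup = BeautifulSoup(markup, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):  # skip anchors without an href
        links.append((anchor.get_text(strip=True), anchor.get("href")))
    return links
```

&lt;p&gt;Since &lt;code&gt;find_all&lt;/code&gt; returns a collection of tags rather than a single tag, &lt;code&gt;.get&lt;/code&gt; and &lt;code&gt;.get_text&lt;/code&gt; are called on each element inside the loop.&lt;/p&gt;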





&lt;h3&gt;
  
  
  Pesky Tags to Deal With
&lt;/h3&gt;

&lt;p&gt;In our example of creating link previews, a good first source of information would obviously be the page's meta tags: specifically the &lt;code&gt;og&lt;/code&gt; tags they've specified to openly provide the bite-sized information we're looking for. Grabbing these tags is a bit more awkward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:description"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now that's ugly. Meta tags are an especially interesting case; they're all uselessly dubbed 'meta', thus we need a second identifier (in addition to the tag name) to specify &lt;em&gt;which&lt;/em&gt; meta tag we care about. Only then can we bother to &lt;em&gt;get&lt;/em&gt; the actual content of said tag.&lt;/p&gt;

&lt;h2&gt;
  
  
  Realizing Something Will Always Break
&lt;/h2&gt;

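&lt;p&gt;If we were to try the above selector on an HTML page that did not contain an &lt;code&gt;og:description&lt;/code&gt;, our script would break unforgivingly: &lt;code&gt;find&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt; when nothing matches, and calling &lt;code&gt;.get&lt;/code&gt; on &lt;code&gt;None&lt;/code&gt; raises an &lt;code&gt;AttributeError&lt;/code&gt;. Not only do we miss this data, but we miss out on everything entirely. This means we always need to build in a plan B, and at the very least handle a missing tag.&lt;/p&gt;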
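
&lt;p&gt;One possible shape for such a plan B is a helper that tries several meta tags in order and returns &lt;code&gt;None&lt;/code&gt; rather than crashing. This is only a sketch: the name &lt;code&gt;first_meta_content&lt;/code&gt; and the idea of passing an ordered list of (attribute, value) pairs are assumptions made for illustration:&lt;/p&gt;

```python
from bs4 import BeautifulSoup


def first_meta_content(markup, candidates):
    """Return the content of the first meta tag that matches, or None.

    `candidates` is an ordered list of (attribute, value) pairs to try.
    """
    soup = BeautifulSoup(markup, "html.parser")
    for attribute, value in candidates:
        tag = soup.find("meta", attrs={attribute: value})
        if tag is not None and tag.get("content"):
            return tag.get("content")
    return None  # nothing matched; the caller decides what to do next
```

&lt;p&gt;Calling it with &lt;code&gt;[("property", "og:description"), ("name", "twitter:description")]&lt;/code&gt; would prefer the Open Graph description and fall back to the Twitter one, without ever calling &lt;code&gt;.get&lt;/code&gt; on a missing tag.&lt;/p&gt;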

&lt;p&gt;It's best to break out this logic one tag at a time. First, let's look at an example for a base scraper with all the knowledge we have so far:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pprint&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape_page_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape target URL for metadata."""&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Methods'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Headers'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Max-Age'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3600'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'User-Agent'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;pp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PrettyPrinter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'html.parser'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'favicon'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_favicon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'sitename'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_site_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'color'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_theme_color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.py





&lt;p&gt;This function lays the foundation for snatching a given URL's metadata. The result we're looking for is a dictionary named &lt;code&gt;metadata&lt;/code&gt;, which contains the data we manage to scrape successfully.&lt;/p&gt;

&lt;p&gt;Each key in our dictionary has a corresponding function which attempts to scrape the corresponding information. Here's what we have for fetching a page's &lt;strong&gt;title&lt;/strong&gt; , &lt;strong&gt;description&lt;/strong&gt; , and &lt;strong&gt;social image&lt;/strong&gt; values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape page title."""&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
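    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;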
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:title"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:title"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"twitter:title"&lt;/span&gt;&lt;span class="p"&gt;}):&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"twitter:title"&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"h1"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"h1"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape page description."""&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;}):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:description"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:description"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:description"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:description"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"p"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"p"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape share image."""&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:image"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:image"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:image"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:image"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"img"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"img"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'src'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.py





&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;get_title&lt;/strong&gt; tries to get the &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; tag, which has a very low chance of failing. Just in case the target page actually &lt;em&gt;is&lt;/em&gt; missing this tag, we fall back to Facebook and Twitter meta tags. If &lt;em&gt;all of this still fails&lt;/em&gt;, we finally resort to trying to pull the first &lt;code&gt;&amp;lt;h1&amp;gt;&lt;/code&gt; tag on the page (if we get to this point, we're probably scraping a garbage site).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;get_description&lt;/strong&gt; is nearly identical to our method for scraping page titles. The last resort is a desperate attempt to pull the first paragraph on the page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;get_image&lt;/strong&gt; looks for the page's "share" image, which is used to generate link previews on social media platforms. Our last resort is to pull the first &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tag containing a source image.&lt;/li&gt;
&lt;/ul&gt;
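
&lt;p&gt;All three functions share the same shape: try the preferred source, then each fallback in order, and keep the first value that comes back non-empty. As a minimal sketch of that pattern, the hypothetical helper &lt;code&gt;first_truthy&lt;/code&gt; below (not part of the original script) takes a list of zero-argument lookups. The &lt;code&gt;page&lt;/code&gt; dictionary is only a stand-in for the BeautifulSoup calls above, so the snippet runs without any dependencies:&lt;/p&gt;

```python
def first_truthy(lookups):
    """Return the first truthy result from a list of zero-argument callables."""
    for lookup in lookups:
        value = lookup()
        if value:
            return value
    # Every lookup came back empty or missing.
    return None


# Hypothetical stand-in for a parsed page; the real script queries BeautifulSoup.
page = {"og:title": "Example Domain", "h1": "Fallback heading"}

title = first_truthy([
    lambda: page.get("title"),          # preferred source, missing here
    lambda: page.get("og:title"),       # first fallback
    lambda: page.get("twitter:title"),  # second fallback
    lambda: page.get("h1"),             # last resort
])
print(title)
```

&lt;p&gt;Because each lookup is wrapped in a lambda, later fallbacks only run when the earlier ones come back empty, which mirrors the behavior of the if/elif chains.&lt;/p&gt;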

&lt;h2&gt;
  
  
  What Did We Build?
&lt;/h2&gt;

&lt;p&gt;The simple script we just threw together is the basis for how most services generate "link previews": embedded widgets that show a synopsis of a page before you click through (think Facebook, Slack, Discord, etc.). Some services even charge around $10/month for what we've just built. Instead of paying for something like that, feel free to take my source code and use it as you please:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="s"&gt;"""Scrape metadata from target URL."""&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pprint&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape_page_metadata&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape target URL for metadata."""&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Origin'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'*'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Methods'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Allow-Headers'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Content-Type'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'Access-Control-Max-Age'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'3600'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;'User-Agent'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0'&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;pp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PrettyPrinter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'html.parser'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;'title'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'description'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'image'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'favicon'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_favicon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'sitename'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_site_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'color'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;get_theme_color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_title&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape page title."""&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
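    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;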
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:title"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:title"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:title"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:title"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"h1"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"h1"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape page description."""&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;}):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:description"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:description"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:description"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:description"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"p"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"p"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape share image."""&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:image"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:image"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:image"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:image"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"img"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"img"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'src'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_site_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape site name."""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:site_name"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;site_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"og:site_name"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'twitter:title'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;site_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;property&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"twitter:title"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;site_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'//'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;site_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'/'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;capitalize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;site_name&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_favicon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape favicon."""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"rel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"icon"&lt;/span&gt;&lt;span class="p"&gt;}):&lt;/span&gt;
        &lt;span class="n"&gt;favicon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"rel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"icon"&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'href'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"rel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"shortcut icon"&lt;/span&gt;&lt;span class="p"&gt;}):&lt;/span&gt;
        &lt;span class="n"&gt;favicon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"link"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"rel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"shortcut icon"&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'href'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;favicon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;f'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rstrip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/favicon.ico'&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;favicon&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_theme_color&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="s"&gt;"""Scrape brand color."""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"theme-color"&lt;/span&gt;&lt;span class="p"&gt;}):&lt;/span&gt;
        &lt;span class="n"&gt;color&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"meta"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attrs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"theme-color"&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'content'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;color&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.py





&lt;p&gt;I've uploaded the source code for this tutorial to GitHub, which contains instructions on how to download and run this script yourself. Enjoy, and join us next time when we up the ante with more nefarious scraping tactics!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/hackersandslackers/beautifulsoup-tutorial"&gt;https://github.com/hackersandslackers/beautifulsoup-tutorial&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>scrapers</category>
      <category>data</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Understanding GatsbyJS: Create Your First Gatsby Theme</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Wed, 10 Jun 2020 14:46:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/understanding-gatsbyjs-create-your-first-gatsby-theme-3e6g</link>
      <guid>https://dev.to/hackersandslackers/understanding-gatsbyjs-create-your-first-gatsby-theme-3e6g</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhackersandslackers-cdn.storage.googleapis.com%2F2020%2F02%2Fgatsbyjs-intro.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhackersandslackers-cdn.storage.googleapis.com%2F2020%2F02%2Fgatsbyjs-intro.jpg" alt="Understanding GatsbyJS: Create Your First Gatsby Theme"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’m no stranger to broadcasting my thoughts, opinions, and occasional lack of knowledge across the eternal internet. That said, I &lt;em&gt;do&lt;/em&gt; pride myself on one thing as a shameless producer of mediocre content: I’ve never blogged about blogging, the state of blogs, or the act of creating blogs. Bloggers who blog about blogging carry the same lack of substance as rappers who rap about the act of rapping. Unfortunately for all of us, my untarnished record of never blogging about blogging ends today.&lt;/p&gt;

&lt;p&gt;We recently rewrote the blog theme for &lt;a href="https://hackersandslackers.com" rel="noopener noreferrer"&gt;Hackers and Slackers&lt;/a&gt; in &lt;strong&gt;GatsbyJS&lt;/strong&gt;: arguably the sexiest option for generating static sites on the JAMStack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why You're Probably Here
&lt;/h2&gt;

&lt;p&gt;You're not here to learn what the JAMStack is, why it’s beneficial, or why you should think it’s cool. There's plenty of well-written documentation on the topic, and there are &lt;em&gt;even more&lt;/em&gt; poorly written Medium articles that mostly copy &amp;amp; paste the former. Apologies for the grumpiness; I’ve been JAMing a bit too hard lately.&lt;/p&gt;

&lt;p&gt;I'm here to shed light on implementing a stack that's worked well for me: &lt;a href="https://ghost.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;Ghost&lt;/strong&gt;&lt;/a&gt; as a CMS, &lt;a href="https://www.gatsbyjs.org/" rel="noopener noreferrer"&gt;&lt;strong&gt;GatsbyJS&lt;/strong&gt;&lt;/a&gt; as a static site generator, and &lt;a href="https://www.netlify.com/" rel="noopener noreferrer"&gt;&lt;strong&gt;Netlify&lt;/strong&gt;&lt;/a&gt; for deployment. This is an excellent combination of tools, but there's an absurdly frustrating lack of &lt;em&gt;centralized&lt;/em&gt; documentation on how these pieces fit together. Each of these services has excelled at delivering its portion of the pipeline. We're here to put the pieces together.&lt;/p&gt;

&lt;p&gt;This series is going to walk through how Gatsby generates static sites. To accomplish this, we're going to create our own Gatsby theme and walk through Gatsby's end-to-end build process. For the sake of this tutorial, we're going to assume you have basic knowledge of GraphQL and React.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Ghost as a Headless CMS?
&lt;/h3&gt;

&lt;p&gt;Netlify has effectively cornered the market as the de-facto host of Gatsby sites, which leaves our choice of CMS to be the most significant decision we need to make in our stack. In my opinion, Ghost is an attractive CMS option because of its philosophy of simplicity. The team behind Ghost has shown a respectable amount of restraint when it comes to adding bloated features and data types, which becomes especially important when managing the schema of a static site generator.&lt;/p&gt;

&lt;p&gt;When building a website with Gatsby, your site's structure is dictated by the relationships predetermined by your primary source of information. Our raw data implicitly makes fundamental decisions about our site's structure, such as what constitutes a "page," or which attributes data models have, such as "tags." Ghost provides us with what we'd expect from a CMS originally intended for blogs: we have &lt;em&gt;authors&lt;/em&gt; creating &lt;em&gt;pages/posts&lt;/em&gt; which contain &lt;em&gt;tags&lt;/em&gt;. It's what we need to build the structure of a static site.&lt;/p&gt;
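
&lt;p&gt;To make that concrete, here's a rough sketch (plain JavaScript, not Gatsby's own code) of how a handful of posts already implies a set of pages. The sample data, the &lt;code&gt;derivePaths&lt;/code&gt; helper, and the URL patterns are all assumptions made up for illustration:&lt;/p&gt;

```javascript
// Hypothetical content shaped like a blog CMS: authors write posts, posts carry tags.
const posts = [
  { slug: 'hello-world', author: 'todd', tags: ['gatsby', 'ghost'] },
  { slug: 'second-post', author: 'todd', tags: ['gatsby'] },
]

// Derive every page path the data implies: one per post, per author, per tag.
// The URL patterns are assumptions for this example.
const derivePaths = (items) => {
  const paths = new Set()
  items.forEach((post) => {
    paths.add(`/${post.slug}/`)
    paths.add(`/author/${post.author}/`)
    post.tags.forEach((tag) => paths.add(`/tag/${tag}/`))
  })
  return [...paths].sort()
}

console.log(derivePaths(posts))
```

&lt;p&gt;Two posts are enough to imply five pages: one per post, one per tag, and one for the author.&lt;/p&gt;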

&lt;p&gt;CMS options like &lt;a href="https://strapi.io/" rel="noopener noreferrer"&gt;Strapi&lt;/a&gt;, &lt;a href="https://prismic.io/" rel="noopener noreferrer"&gt;Prismic&lt;/a&gt;, and &lt;a href="https://www.contentful.com/" rel="noopener noreferrer"&gt;Contentful&lt;/a&gt; are fantastic in what they're able to achieve by abstracting content types. Strapi doesn't even assume the relationship between &lt;em&gt;pages&lt;/em&gt; and &lt;em&gt;authors&lt;/em&gt; unless you explicitly create those content types and define a many-to-many relationship between them. While this is extremely powerful, I've found that the power to change the fundamental data structure of a site is more dangerous than beneficial. Sometimes we need to protect ourselves from ourselves. This is where Ghost comes in: aside from being a good CMS, Ghost allows us to build a site structure first and extend it later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Starting a GatsbyJS Theme
&lt;/h2&gt;

&lt;p&gt;First things first, we'll need to install the &lt;a href="https://www.npmjs.com/package/gatsby-cli" rel="noopener noreferrer"&gt;Gatsby CLI&lt;/a&gt;. The CLI allows us to create new Gatsby projects from the command line easily:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i gatsby-cli &lt;span class="nt"&gt;-g&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Install Gatsby CLI globally.





&lt;p&gt;The best way to get started with Gatsby is by cloning one of the many &lt;a href="https://www.gatsbyjs.org/starters/?v=2" rel="noopener noreferrer"&gt;starter templates&lt;/a&gt; Gatsby has to offer and iterating on them to make the theme our own. Because we're using Ghost as our CMS, it makes the most sense to start with the &lt;a href="https://github.com/TryGhost/gatsby-starter-ghost" rel="noopener noreferrer"&gt;Ghost starter template&lt;/a&gt;. Gatsby-CLI makes it easy to create new Gatsby projects from existing ones on GitHub, like so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gatsby new my-gatsby-project https://github.com/TryGhost/gatsby-starter-ghost.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Create new Gatsby project. 





&lt;p&gt;Running &lt;code&gt;gatsby new&lt;/code&gt; is essentially the equivalent of running &lt;code&gt;git clone&lt;/code&gt; and &lt;code&gt;npm install&lt;/code&gt; within the resulting folder. The only difference is &lt;code&gt;gatsby new&lt;/code&gt; will not retain a git remote, whereas &lt;code&gt;git clone&lt;/code&gt; would.&lt;/p&gt;

&lt;p&gt;We can already run our site locally to see what we've started:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;my-gatsby-project
&lt;span class="nv"&gt;$ &lt;/span&gt;gatsby develop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Run Gatsby locally.





&lt;p&gt;The &lt;code&gt;gatsby develop&lt;/code&gt; command builds the Gatsby project in the current directory in development mode and serves it from a local development server. We can now preview our theme locally at &lt;code&gt;http://localhost:8000&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-5.46.47-AM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-5.46.47-AM.png" alt="Understanding GatsbyJS: Create Your First Gatsby Theme"&gt;&lt;/a&gt;Starter theme deployed to &lt;a href="http://localhost:8000" rel="noopener noreferrer"&gt;http://localhost:8000&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we have a working Gatsby theme, we can begin to dissect how Gatsby works. Let's start by dealing with all this placeholder content.&lt;/p&gt;

&lt;p&gt;The Ghost Gatsby starter theme is configured to point to placeholder content by default. We can easily configure our theme to point to our own Ghost admin instead by changing the values in &lt;strong&gt;.ghost.json&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"development"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"apiUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://gatsby.ghost.io"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"contentApiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"9cc5c67c358edfdd81455149d0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"production"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"apiUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://gatsby.ghost.io"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"contentApiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"9cc5c67c358edfdd81455149d0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
.ghost.json





&lt;p&gt;The config asks for two values: an &lt;code&gt;apiUrl&lt;/code&gt; and a &lt;code&gt;contentApiKey&lt;/code&gt;. These refer to values you'll find in your own Ghost admin by creating an integration on the &lt;strong&gt;integrations&lt;/strong&gt; tab. Here's what mine looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-5.56.28-AM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-5.56.28-AM.png" alt="Understanding GatsbyJS: Create Your First Gatsby Theme"&gt;&lt;/a&gt;Ghost integration for sourcing content to Gatsby.&lt;/p&gt;

&lt;p&gt;Creating any integration will provide us with a &lt;strong&gt;Content API Key&lt;/strong&gt; and an &lt;strong&gt;API URL&lt;/strong&gt;, which are the two things we need for our config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"development"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"apiUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers.app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"contentApiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8a1becd7267fd71108c327c0f6"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"production"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"apiUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://hackersandslackers.app"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"contentApiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"8a1becd7267fd71108c327c0f6"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
.ghost.json
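
&lt;p&gt;How a theme consumes this file is up to the theme. As a minimal sketch, assuming the build picks one of the two sections based on &lt;code&gt;NODE_ENV&lt;/code&gt;, it might look something like this (the &lt;code&gt;pickGhostConfig&lt;/code&gt; helper and the example values are hypothetical, not copied from the starter):&lt;/p&gt;

```javascript
// Hypothetical helper: choose the Ghost credentials for the current environment.
// The object shape mirrors .ghost.json; the function name is an assumption.
const pickGhostConfig = (config, env = process.env.NODE_ENV) => {
  const section = env === 'development' ? config.development : config.production
  if (!section || !section.apiUrl || !section.contentApiKey) {
    throw new Error(`Missing apiUrl or contentApiKey for "${env}"`)
  }
  return section
}

// Placeholder values, not real credentials.
const exampleConfig = {
  development: { apiUrl: 'https://dev.example.com', contentApiKey: 'dev-key' },
  production: { apiUrl: 'https://example.com', contentApiKey: 'prod-key' },
}

console.log(pickGhostConfig(exampleConfig, 'development').apiUrl)
```

&lt;p&gt;Failing loudly on a missing key is a deliberate choice here: a build with no credentials is easier to debug than one that silently sources nothing.&lt;/p&gt;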





&lt;p&gt;Save this file and confirm that &lt;code&gt;http://localhost:8000&lt;/code&gt; now serves your content (if you left &lt;code&gt;gatsby develop&lt;/code&gt; running, the site should hot reload for you). Chances are that your content isn't going to immediately look great. This is what my abomination looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-6.05.28-AM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-6.05.28-AM.png" alt="Understanding GatsbyJS: Create Your First Gatsby Theme"&gt;&lt;/a&gt;&lt;strong&gt;gatsby-starter-ghost&lt;/strong&gt; sourcing Hackers and Slackers content.&lt;/p&gt;

&lt;p&gt;The content coming from my Ghost admin looks awful in a default theme, which shouldn't surprise us. We're going to need to make some changes to this theme.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anatomy of a Gatsby Site
&lt;/h2&gt;

&lt;p&gt;Navigating a Gatsby theme for the first time is probably a bit overwhelming. There are a lot of Gatsby-specific things we’ve never seen before (obviously), which might be challenging to dissect at first glance. Let’s see what we’ve got:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/my-gatsby-project
├── /src
│ ├── /components
│ ├── /pages
│ ├── /styles
│ ├── /templates
│ └── /utils
├── /static
│ ├── /images
│ └── /fonts
├── /public
├── /node_modules
├── package.json
├── package-lock.json
├── .ghost.json
├── netlify.toml
├── gatsby-node.js
├── gatsby-config.js
└── gatsby-browser.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/my-gatsby-project





&lt;p&gt;Gatsby's purpose is to take raw data from sources (like our Ghost admin), use that data to inform our site's structure, and finally transform our data to produce a site mostly comprised of static HTML and CSS. All of these static pages, styles, and assets live in the &lt;strong&gt;public&lt;/strong&gt; folder. You should never need to work within this folder, as its output will change with every build.&lt;/p&gt;

&lt;h3&gt;
  
  
  Building to the Public Folder
&lt;/h3&gt;

&lt;p&gt;The simplicity of static sites enables them to have speedy load times. Static pages don’t need to think about which widgets or navigation items to display each time a user loads a page. They don’t need to rely on frontend JavaScript to manipulate pages. Most impressive of all, this &lt;em&gt;particular&lt;/em&gt; breed of static site doesn’t need to wait before loading the pages you’ll probably click on next. Because every static page has a finite number of links to other static pages, Gatsby can load pages before you click on them.&lt;/p&gt;
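
&lt;p&gt;The idea is simple enough to sketch in a few lines. This is purely illustrative and is not how Gatsby implements it internally; the &lt;code&gt;siteMap&lt;/code&gt; object and &lt;code&gt;pagesToPrefetch&lt;/code&gt; function are invented for the example:&lt;/p&gt;

```javascript
// Illustrative only: a tiny site map where each page lists the pages it links to.
// This is not Gatsby's internal code; every name here is made up for the example.
const siteMap = {
  '/': ['/about/', '/posts/hello-world/'],
  '/about/': ['/'],
  '/posts/hello-world/': ['/', '/tag/gatsby/'],
  '/tag/gatsby/': ['/posts/hello-world/'],
}

// Return the pages worth fetching ahead of time for the page being viewed,
// skipping anything that has already been loaded.
const pagesToPrefetch = (map, currentPath, alreadyLoaded = new Set()) =>
  (map[currentPath] || []).filter((path) => !alreadyLoaded.has(path))

console.log(pagesToPrefetch(siteMap, '/posts/hello-world/', new Set(['/'])))
```

&lt;p&gt;Because the list of outgoing links is known at build time, the set of candidates for any page is small and fixed.&lt;/p&gt;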

&lt;p&gt;We’re tossing the word “static” around a lot here, which sounds kind of like we're dealing with the types of shitty sites we made using Dreamweaver in the '90s. Those were the days when changing a single link meant changing that same link manually on 100 other pages. Perhaps you're a bit more modern and picturing a workflow more along the lines of Jekyll and GitHub Pages. The default method of deploying a production Gatsby site is by using the &lt;code&gt;gatsby build&lt;/code&gt; command, which generates a site comprised of unintelligent markup and styles. That said, most Gatsby developers will hardly need to deploy using &lt;code&gt;gatsby build&lt;/code&gt; at all.&lt;/p&gt;

&lt;p&gt;The "A" in JAMstack stands for APIs. By setting up webhooks in our Ghost admin, we can trigger a &lt;code&gt;gatsby build&lt;/code&gt; job &lt;em&gt;every time we update content in our CMS&lt;/em&gt;. Most static sites are hosted on services like Netlify, which continuously listen to for changes to our content via webhooks and rebuild our website accordingly. Setting up such a webhook in Ghost is as easy as expanding on the &lt;strong&gt;integration&lt;/strong&gt; we created earlier. Here's what I use to automatically trigger builds to Netlify upon content updates in Ghost:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-6.22.42-AM.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2FScreen-Shot-2019-09-22-at-6.22.42-AM.png" alt="Understanding GatsbyJS: Create Your First Gatsby Theme"&gt;&lt;/a&gt;Ghost admin -&amp;gt; Integrations -&amp;gt; gatsby&lt;/p&gt;
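
&lt;p&gt;Under the hood, a webhook like this boils down to an HTTP POST to a build hook URL. Here's a hedged sketch of that request in JavaScript; &lt;code&gt;buildHookRequest&lt;/code&gt; and &lt;code&gt;HOOK_ID&lt;/code&gt; are placeholders, and the URL pattern assumes a Netlify build hook:&lt;/p&gt;

```javascript
// Hypothetical helper: describe the HTTP request a CMS webhook would send
// to kick off a Netlify build. HOOK_ID is a placeholder, not a real hook.
const buildHookRequest = (hookId) => ({
  url: `https://api.netlify.com/build_hooks/${hookId}`,
  options: { method: 'POST' },
})

const request = buildHookRequest('HOOK_ID')
console.log(request.options.method, request.url)

// To actually trigger a build (Node 18+ ships a global fetch):
// fetch(request.url, request.options).then((res) => console.log(res.status))
```

&lt;p&gt;Ghost sends this request for us once the hook URL is saved in the integration, so the snippet is only useful for testing a hook by hand.&lt;/p&gt;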

&lt;p&gt;The reality of GatsbyJS and other site generators in the JAMStack is that they're hardly "static" at all. Even though the pages we serve to user clients are technically "static," a simple webhook has our Gatsby theme rebuilding itself over and over, remaking the contents of the &lt;strong&gt;public&lt;/strong&gt; folder from scratch each time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Templates, Pages, and Components
&lt;/h2&gt;

&lt;p&gt;So, the end result of &lt;code&gt;gatsby build&lt;/code&gt; or &lt;code&gt;gatsby develop&lt;/code&gt; is to generate the files which make up our site and dump them into the &lt;strong&gt;public&lt;/strong&gt; folder. After sourcing our own content and seeing the ugly result, it's clear that we're going to make some changes to our page layouts. The first logical move would be to make changes to the presentation layer, which is contained entirely in the &lt;strong&gt;src&lt;/strong&gt; folder.&lt;/p&gt;

&lt;p&gt;Gatsby's &lt;strong&gt;src&lt;/strong&gt; folder contains the logic for generating the HTML and stylesheets which ultimately make up the pages that get built. Each JavaScript file living in &lt;strong&gt;src&lt;/strong&gt; is essentially a React component. Each of these components outputs JSX as a result of its own GraphQL queries (or data passed in from &lt;em&gt;other&lt;/em&gt; components' GraphQL queries). Most of the time we spend customizing our theme will occur in the &lt;strong&gt;src&lt;/strong&gt; folder.&lt;/p&gt;

&lt;p&gt;Let's first concentrate on customizing a page &lt;strong&gt;template&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Templates
&lt;/h3&gt;

&lt;p&gt;A &lt;em&gt;template&lt;/em&gt; is a repeating page structure that will be used by multiple pages on our site. A perfect example of when to use a template would be blog posts. Blogs typically have thousands of "posts" in the sense of content, but each of these posts likely utilizes a single "post" template. All sites follow these types of patterns, thus our templates are going to determine the vast majority of what people see on our site.&lt;/p&gt;

&lt;p&gt;Here's a simple example of what a GatsbyJS blog post template looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;React&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;prop-types&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;graphql&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gatsby&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Helmet&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react-helmet&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Layout&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../components/common&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;MetaData&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../components/common/meta&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ghostPost&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;MetaData&lt;/span&gt;
          &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;excerpt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"article"&lt;/span&gt;
        &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Layout&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"container"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;article&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="si"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;feature_image&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;figure&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"post-feature-image"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
                  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;img&lt;/span&gt; &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;feature_image&lt;/span&gt; &lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;alt&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt; &lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;figure&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="si"&gt;}&lt;/span&gt;
              &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;section&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"post-full-content"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;h1&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"content-title"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;h1&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;main&lt;/span&gt;
                  &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"content-body load-external-scripts"&lt;/span&gt;
                  &lt;span class="na"&gt;dangerouslySetInnerHTML&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;__html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
              &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;section&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;article&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Layout&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;propTypes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;ghostPost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;html&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;feature_image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;Post&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;postQuery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;graphql&lt;/span&gt;&lt;span class="s2"&gt;`
  query($slug: String!) {
    ghostPost(slug: { eq: $slug }) {
      title
      html
      feature_image
    }
  }
`&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src/templates/post.js





&lt;p&gt;Templates are composed of three parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;GraphQL Data&lt;/strong&gt;: At the bottom of our template, we have a GraphQL query named &lt;code&gt;postQuery&lt;/code&gt;. This query pulls post-specific information sourced from Ghost for the current page: the &lt;em&gt;title&lt;/em&gt;, &lt;em&gt;html&lt;/em&gt;, and &lt;em&gt;feature_image&lt;/em&gt;. Running this query lets us use this data in our template as part of the &lt;code&gt;data&lt;/code&gt; object passed to &lt;code&gt;Post&lt;/code&gt;. If we wanted our post to include information like the name of the author, we'd have to add that field to our query as well.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PropTypes&lt;/strong&gt;: We type-check the results of our GraphQL query before using this data in our page. Setting PropTypes associates each item of data with the data type we're expecting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Template Structure&lt;/strong&gt;: &lt;code&gt;Post&lt;/code&gt; is the JSX that ultimately renders each post page. It is essentially a React component that accepts a prop called &lt;code&gt;data&lt;/code&gt;, which holds the results of our GraphQL query, &lt;code&gt;postQuery&lt;/code&gt;. Take note of how we build our template in JSX and include the data we decided was important, such as &lt;code&gt;{ post.title }&lt;/code&gt; or &lt;code&gt;{ post.feature_image }&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
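&lt;p&gt;To make the hand-off between these three parts concrete, here is a minimal sketch in plain JavaScript (no React or Gatsby required) of how a query result arrives as a &lt;code&gt;data&lt;/code&gt; object whose shape mirrors the query. The sample values and the &lt;code&gt;summarizePost&lt;/code&gt; function are hypothetical and exist only for illustration:&lt;/p&gt;

```javascript
// Hypothetical result of postQuery; its shape mirrors the fields we queried.
const data = {
  ghostPost: {
    title: 'Hello Gatsby',
    html: 'Post body as an HTML string',
    feature_image: null,
  },
}

// Stand-in for the Post template: it receives { data } and reads ghostPost.
const summarizePost = ({ data }) => {
  const post = data.ghostPost
  // feature_image is optional in our PropTypes, so fall back when it is missing.
  const image = post.feature_image ? post.feature_image : 'no feature image'
  return `${post.title} (${image})`
}

console.log(summarizePost({ data }))
```

&lt;p&gt;Any field missing from the query would simply be absent from &lt;code&gt;data.ghostPost&lt;/code&gt;, which is why the query and the PropTypes should list the same fields.&lt;/p&gt;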

&lt;h3&gt;
  
  
  Components
&lt;/h3&gt;

&lt;p&gt;A &lt;em&gt;component&lt;/em&gt; is a reusable block of code intended to be shared by multiple pages and templates, such as a widget or navigation item (a better term for these might be "partials"). For example, I have a component called &lt;code&gt;AuthorCard&lt;/code&gt;, which details the information of a single author:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;React&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;prop-types&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
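&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Link&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gatsby&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;FontAwesomeIcon&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@fortawesome/react-fontawesome&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;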

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;AuthorCard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;headerClass&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authorTwitterUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;twitter&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`https://twitter.com/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;twitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/^@/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;``&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authorFacebookUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;facebook&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`https://www.facebook.com/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;facebook&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/^&lt;/span&gt;&lt;span class="se"&gt;\/&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;``&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;classes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;headerClass&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`author-card info-card`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`author-card`&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="o"&gt;&amp;lt;&amp;gt;&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;header&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;classes&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-image&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;profile_image&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;lazyload&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;src&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;profile_image&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;alt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&amp;gt; : &amp;lt;FontAwesomeIcon icon="user-edit" size="sm" /&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-content&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Link&lt;/span&gt; &lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;`/author/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/Link&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;            &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-meta&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;postCount&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-item&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;postCount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;Posts&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/span&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-item&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/span&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;website&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-item&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;href&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;website&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_blank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;rel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;noopener noreferrer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Website&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/a&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;authorTwitterUrl&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-item&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;href&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;authorTwitterUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_blank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;rel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;noopener noreferrer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Twitter&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/a&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;authorFacebookUrl&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-item&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;href&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;authorFacebookUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;_blank&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="nx"&gt;rel&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;noopener noreferrer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Facebook&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/a&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;            &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bio&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author-card-bio&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bio&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/p&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;}
&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/header&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;AuthorCard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;propTypes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;bio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;profile_image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;website&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;twitter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;facebook&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;postCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;location&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nx"&gt;isRequired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headerClass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PropTypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;AuthorCard&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src/components/authors/AuthorCard.js
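&lt;p&gt;The two URL constants at the top of &lt;code&gt;AuthorCard&lt;/code&gt; are plain string handling, so they can be tried outside of React. The sketch below lifts that logic into a standalone function; the name &lt;code&gt;buildSocialUrls&lt;/code&gt; and the sample handles are made up for illustration:&lt;/p&gt;

```javascript
// Same logic as the top of AuthorCard, extracted into a hypothetical helper.
const buildSocialUrls = (author) => ({
  // Strip a leading "@" so "@example" and "example" produce the same URL.
  twitter: author.twitter ? `https://twitter.com/${author.twitter.replace(/^@/, '')}` : null,
  // Strip a leading "/" from the Facebook handle.
  facebook: author.facebook ? `https://www.facebook.com/${author.facebook.replace(/^\//, '')}` : null,
})

console.log(buildSocialUrls({ twitter: '@example', facebook: '/example.page' }))
// An author with no social handles yields null for both links.
console.log(buildSocialUrls({}))
```

&lt;p&gt;Returning &lt;code&gt;null&lt;/code&gt; for a missing handle is what lets the JSX skip the corresponding link entirely.&lt;/p&gt;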





&lt;h4&gt;
  
  
  Components and GraphQL
&lt;/h4&gt;

&lt;p&gt;Component files are structured in the same way as templates, with one fundamental difference: components cannot create &lt;em&gt;dynamic&lt;/em&gt; GraphQL queries.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;AuthorCard&lt;/code&gt; has the same fundamental structure as our &lt;code&gt;Post&lt;/code&gt; template, but it does not have its own GraphQL query. Instead, &lt;code&gt;AuthorCard&lt;/code&gt; accepts its data as props; this means whichever page or template contains this partial can simply pass data from its own GraphQL queries down into child components.&lt;/p&gt;

&lt;p&gt;In the &lt;code&gt;AuthorCard&lt;/code&gt; example, one of our props is called &lt;strong&gt;author&lt;/strong&gt;, which is expected to contain all the author-related data we need. To supply it, we can import our author card into our post template and include it in &lt;code&gt;Post&lt;/code&gt;'s JSX:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AuthorCard&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../components/authors&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ghostPost&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;author&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ghostAuthor&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;AuthorCard&lt;/span&gt; &lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;    &lt;span class="p"&gt;...&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src/templates/post.js





&lt;p&gt;&lt;code&gt;author&lt;/code&gt; is read from &lt;code&gt;data.ghostAuthor&lt;/code&gt;, which we get by expanding our post's GraphQL query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;postQuery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;graphql&lt;/span&gt;&lt;span class="s2"&gt;`
  query($slug: String!, $primaryAuthor: String!) {
    ...
    ghostAuthor(slug: {eq: $primaryAuthor}) {
      postCount
      location
      facebook
      cover_image
      bio
      name
      slug
      twitter
      website
      profile_image
    }
    ....
  }
`&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src/templates/post.js
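&lt;p&gt;A page query's variables, such as &lt;code&gt;$slug&lt;/code&gt; and &lt;code&gt;$primaryAuthor&lt;/code&gt;, are filled in from the &lt;code&gt;context&lt;/code&gt; object handed to Gatsby's &lt;code&gt;createPage&lt;/code&gt; action in &lt;code&gt;gatsby-node.js&lt;/code&gt;. The sketch below only builds that argument in plain JavaScript so it can run anywhere; the &lt;code&gt;buildPostPage&lt;/code&gt; helper, the node shape, and the URL pattern are assumptions rather than the starter's actual code:&lt;/p&gt;

```javascript
// Hypothetical helper: turns a post node into the argument for createPage.
// Keys inside `context` become the $variables available to the page query.
const buildPostPage = (node, templatePath) => ({
  path: `/${node.slug}/`, // assumed URL pattern
  // In a real gatsby-node.js this is usually resolved to an absolute path.
  component: templatePath,
  context: {
    slug: node.slug, // becomes $slug
    primaryAuthor: node.primary_author.slug, // becomes $primaryAuthor
  },
})

// Assumed node shape, for illustration only.
const page = buildPostPage(
  { slug: 'my-first-post', primary_author: { slug: 'example-author' } },
  './src/templates/post.js'
)

console.log(page.context)
```

&lt;p&gt;Because these values are fixed per page at build time, the template's query can be "dynamic" without the component itself knowing which post it is rendering.&lt;/p&gt;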





&lt;p&gt;The thinking here is that &lt;em&gt;templates should pass contextual data on to their child components&lt;/em&gt;. There is sanity in keeping our GraphQL queries on the templates that utilize them, as opposed to letting templates &lt;em&gt;and their children&lt;/em&gt; pull data independently of one another.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;AuthorCard&lt;/code&gt; now has contextual information about the author.&lt;/p&gt;

&lt;h4&gt;
  
  
  Static Queries in Components
&lt;/h4&gt;

&lt;p&gt;There are cases where components &lt;em&gt;can&lt;/em&gt; execute GraphQL queries, but only when the data they pull is not contextual. In other words, these components can only run GraphQL queries that do not use variables. These queries are called &lt;strong&gt;Static Queries&lt;/strong&gt;. It's best not to linger on this topic, but here's an example of a static query used for site-wide metadata in our Ghost Gatsby template:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MetaDataQuery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;props&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;StaticQuery&lt;/span&gt;
    &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;graphql&lt;/span&gt;&lt;span class="s2"&gt;`
      query GhostSettingsMetaData {
        allGhostSettings {
          edges {
            node {
              title
              description
            }
          }
        }
      }
    `&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="na"&gt;render&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;MetaData&lt;/span&gt; &lt;span class="na"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;props&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;MetaDataQuery&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src/components/common/meta/MetaData.js





&lt;h3&gt;
  
  
  Pages
&lt;/h3&gt;

&lt;p&gt;The third and final type of layout in GatsbyJS is the &lt;em&gt;page&lt;/em&gt;, not to be confused with the &lt;em&gt;template&lt;/em&gt;. Where templates are reusable, Gatsby &lt;em&gt;pages&lt;/em&gt; will only ever exist once on our site, such as an error page or transactional confirmation. The syntax for creating a page is identical to that of creating a template.&lt;/p&gt;

&lt;p&gt;Every page we create will inevitably require some standard information. No matter what our page is for, it's going to need a title, some metadata, and a URL (obviously). Ghost provides us with a number of &lt;a href="https://hackersandslackers.com/creating-updating-and-deleting-data-via-graphql-mutations/" rel="noopener noreferrer"&gt;GraphQL Fragments&lt;/a&gt; to help us grab all properties of a page (or post) at once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pageQuery&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;graphql&lt;/span&gt;&lt;span class="s2"&gt;`
  query GhostPageQuery($slug: String) {
    ghostPage(slug: {eq: $slug}) {
      ...GhostPageFields
    }
  }
`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src/templates/page.js





&lt;h2&gt;
  
  
  Gatsby Configuration &amp;amp; Plugins
&lt;/h2&gt;

&lt;p&gt;Cruising through the &lt;strong&gt;src&lt;/strong&gt; folder gives us a pretty good idea of how to modify the structure of the pages our site will serve. That's great, but where does the data feeding these pages actually &lt;em&gt;come&lt;/em&gt; from? How do our components know the data source we're querying? Without any data model configuration on our side, our components already recognize things like &lt;code&gt;ghostPage&lt;/code&gt; and &lt;code&gt;ghostPost&lt;/code&gt; as data types.&lt;/p&gt;

&lt;p&gt;Sourcing data to Gatsby happens in a magic file called &lt;strong&gt;gatsby-config.js&lt;/strong&gt;. Gatsby is configured by installing and tweaking an entire ecosystem of &lt;a href="https://www.gatsbyjs.org/plugins/" rel="noopener noreferrer"&gt;Gatsby plugins&lt;/a&gt;, and some of those plugins tell Gatsby where to look for our data. If you're familiar with Webpack, &lt;strong&gt;gatsby-config&lt;/strong&gt; plays much the same role as a Webpack configuration file. A few examples of what our theme already includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gatsby-plugin-feed&lt;/strong&gt; : Generates a highly-configurable RSS feed for our site.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gatsby-plugin-advanced-sitemap&lt;/strong&gt; : Serves an SEO-friendly sitemap.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gatsby-plugin-react-helmet&lt;/strong&gt; : Provides a JSX element to easily set metadata per page.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are all fine and necessary, but the plugins we want to pay special attention to are the ones with the prefix &lt;em&gt;"gatsby-source-"&lt;/em&gt;. Our config has two of these by default, &lt;strong&gt;gatsby-source-filesystem&lt;/strong&gt; and &lt;strong&gt;gatsby-source-ghost&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`gatsby-source-filesystem`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;__dirname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`src`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`images`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`images`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`gatsby-source-ghost`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;NODE_ENV&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s2"&gt;`development`&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt;
    &lt;span class="nx"&gt;ghostConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;development&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nx"&gt;ghostConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;production&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
gatsby-config.js





&lt;p&gt;&lt;strong&gt;gatsby-source-filesystem&lt;/strong&gt; is a plugin that sources content from our local file structure. In the above example, it's being used to serve images from a local folder. If we wanted to, we could source our entire Gatsby site from locally saved Markdown files. Luckily, we aren't the types of savage barbarians who blog by building Jekyll sites. We're gentlemen, just as the Great Gatsby himself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;gatsby-source-ghost&lt;/strong&gt; allows us to query content from Ghost sites. Simply installing this plugin gives us access to Ghost data models in our GraphQL queries. In terms of knowing &lt;em&gt;which&lt;/em&gt; Ghost admin to source from, this is what we handled when we configured &lt;strong&gt;.ghost.json&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;As it turns out, sourcing content to Gatsby is perhaps one of its most alluring features. Our configuration is already pulling from &lt;em&gt;two&lt;/em&gt; content sources, and adding a third source would be as easy as installing a plugin. We're not just talking about multiple CMS sources; Gatsby allows us to source content from GitHub repositories, Tweets, JIRA, or even databases directly. Each "source" plugin we install gives us opportunities to create powerful associations between our data, joining information from different sources without ever touching a database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Sources in Action
&lt;/h3&gt;

&lt;p&gt;You've probably noticed the prompt Gatsby gives after running &lt;code&gt;gatsby develop&lt;/code&gt;, which encourages you to explore your site's data schema at &lt;a href="http://localhost:8000/___graphql" rel="noopener noreferrer"&gt;&lt;code&gt;http://localhost:8000/___graphql&lt;/code&gt;&lt;/a&gt;. This GraphQL playground is your best friend: the easiest way to understand the resulting schemas of sources you configure is via this interface. Here's what my schema looks like after installing &lt;strong&gt;gatsby-source-git&lt;/strong&gt; and &lt;strong&gt;@gatsby-contrib/gatsby-transformer-ipynb&lt;/strong&gt; to pull and parse Jupyter notebooks from a GitHub repo:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2Fgatsby-jupyter-source-1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fstorage.googleapis.com%2Fhackersandslackers-cdn%2F2019%2F09%2Fgatsby-jupyter-source-1.png" alt="Understanding GatsbyJS: Create Your First Gatsby Theme"&gt;&lt;/a&gt;A GraphQL query pulling .ipynb files from Github.&lt;/p&gt;

&lt;p&gt;Adding two plugins is the only configuration needed to build this query. Here's what we just achieved with minimal effort:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gatsby recognized that files were added to our local file structure and provided us with information about said files (such as file name, extension, etc.). We can query all of these attributes.&lt;/li&gt;
&lt;li&gt;Of the local files Gatsby found, our newly added plugin identified &lt;strong&gt;.ipynb&lt;/strong&gt; files as &lt;em&gt;Jupyter Notebooks.&lt;/em&gt; This allows us to query Jupyter-specific attributes of those files, in addition to the general information we already had accessible.&lt;/li&gt;
&lt;li&gt;Gatsby &lt;em&gt;also&lt;/em&gt; recognizes that these Jupyter files were sourced from Github, so we can pull repository-level metadata about where these pages were sourced from.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is &lt;em&gt;absolutely insane&lt;/em&gt;. It's difficult to express how powerful this is in words, so I won't even try. Let's move on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building Routes in Gatsby
&lt;/h2&gt;

&lt;p&gt;We now know how Gatsby sources its data, and how Gatsby eventually creates pages from that data. The third and final piece of our puzzle is between these two layers. This piece of our site handles the URL patterns and routing of the pages we create, and it all happens in &lt;strong&gt;gatsby-node.js&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Before our site can be built, we need to know how it'll be structured. Blogs in particular share a few common patterns. They usually have paginated lists of posts, author profiles, and "tag" pages where posts sharing a particular tag can all be viewed at once. We happen to be in luck because our Ghost starter template handles all of these things for us. As briefly as possible, the chain of events happening in &lt;strong&gt;gatsby-node&lt;/strong&gt; is like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Content sources are queried for &lt;em&gt;only the information necessary&lt;/em&gt; to build routes to our pages&lt;/li&gt;
&lt;li&gt;The queried data is split into a single segment per template type. For example, we extract the names of all the posts we'll publish by setting &lt;code&gt;const posts = result.data.allGhostPost.edges&lt;/code&gt;. The same is done for author pages, static pages, etc.&lt;/li&gt;
&lt;li&gt;With our data grouped 1-to-1 with the pages they create, we then loop through each group to call a &lt;code&gt;createPage&lt;/code&gt; function. Let's use posts as an example. In this step, we're telling Gatsby to create a page using the &lt;code&gt;post.js&lt;/code&gt; template for each "post" we pull in GraphQL. A part of this process is passing the URL structure of where each of these generated pages will live.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There's a lot to take in here. Luckily for us, our template already handles the heavy-lifting of creating a site structure for us. When the time comes to add a new static page or grouping of templates, following the format of what already exists in &lt;strong&gt;gatsby-node.js&lt;/strong&gt; is relatively straightforward.&lt;/p&gt;

&lt;h3&gt;
  
  
  Page Contexts in gatsby-node
&lt;/h3&gt;

&lt;p&gt;If there's one pitfall of working through the &lt;strong&gt;gatsby-node&lt;/strong&gt; file, it would be the concept of "page contexts". Let's look at the &lt;code&gt;createPage&lt;/code&gt; function I have for creating posts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;createPage&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;postTemplate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Data passed to context is available&lt;/span&gt;
    &lt;span class="c1"&gt;// in page queries as GraphQL variables.&lt;/span&gt;
    &lt;span class="na"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;primaryAuthor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;primaryTag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;seriesSlug&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;series&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;seriesTitle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
gatsby-node.js





&lt;p&gt;The first two parameters of &lt;code&gt;createPage&lt;/code&gt; are simple: &lt;code&gt;path&lt;/code&gt; determines the route of this instance of a page, and &lt;code&gt;component&lt;/code&gt; refers to whichever React component in &lt;strong&gt;src&lt;/strong&gt; we want to build the page with.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;context&lt;/code&gt; is where things get interesting. Variables passed into a page context can be accessed by the target template in two ways. The first way is as a variable in the template's GraphQL query (this is how we see things like &lt;code&gt;query($slug: String!)&lt;/code&gt; ). Passing variables to pages is useful if a page contains features that depend on knowing more about &lt;em&gt;which instance of a page it is&lt;/em&gt;. For example, I pass &lt;code&gt;primaryTag&lt;/code&gt; to posts as a way of querying other posts with the same primary tag to build a related posts widget.&lt;/p&gt;

&lt;p&gt;We're getting way too deep here. I won't even mention the &lt;code&gt;pageContext&lt;/code&gt; object, which gets passed into templates for purposes of things like pagination. Let's move on.&lt;/p&gt;

&lt;h2&gt;
  
  
  What About Frontend Javascript?
&lt;/h2&gt;

&lt;p&gt;Client-side JS should be a last resort when building static sites, but there are times when it needs to happen. &lt;strong&gt;gatsby-browser&lt;/strong&gt; allows us to execute client-side Javascript in response to browser events like &lt;code&gt;onRouteUpdate()&lt;/code&gt;, which is triggered each time a user changes pages. This is how we can implement code syntax highlighting, for example.&lt;/p&gt;

&lt;p&gt;The full list of browser events we can use to trigger scripts can be found &lt;a href="https://www.gatsbyjs.org/docs/browser-apis/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is Gatsby THAT Great Tho?
&lt;/h2&gt;

&lt;p&gt;It's right to be skeptical of any new technology. This is &lt;em&gt;especially&lt;/em&gt; the case with JavaScript frameworks, the hype cycle of which has become a meme in itself. After writing over 4 thousand words attempting to explain the internals of Gatsby &lt;em&gt;at a high level&lt;/em&gt;, it's clear that Gatsby is architecturally complicated. For newer devs who might not have previous experience with React, GraphQL, or Webpack, I can only imagine how one can feel at the bottom of the mountain looking up.&lt;/p&gt;

&lt;p&gt;For more experienced developers, &lt;em&gt;Gatsby is totally that great&lt;/em&gt;. Gatsby improves on so many aspects of modern web development that it's difficult to summarize &lt;em&gt;why&lt;/em&gt; Gatsby is so great to those for whom it is suited. Praising "faster load times" doesn't do justice to the efficient, painless ecosystem of GatsbyJS. This is a rare moment where I'd argue that a framework lives up to the hype, at the very least.&lt;/p&gt;

&lt;p&gt;That said, we need to acknowledge the implications that things like Gatsby create for developers as a whole. For those of us who've grown up with Javascript's nuances and frameworks, learning Gatsby is a manageable step forward. It's easy to neglect that this is only true because we've accrued a lifetime of related knowledge before this point. This knowledge puts us in a favorable position to learn &lt;em&gt;one more thing&lt;/em&gt;. If we were to imagine being on the outside looking in, it feels like Gatsby is another layer of "things to know" in the comical collection of nonsense that is Javascript. While I'm an advocate of Gatsby, it's important to recognize that learning Gatsby is a privilege of circumstance. Most employed developers work for enterprises that cannot (nor ever should) consider major changes to their technology stacks. It's unreasonable to think "this is the direction the world is going," because most people in the world are preoccupied with making the world work. And families, or whatever.&lt;/p&gt;

&lt;p&gt;Anyway, Gatsby is excellent if you're in any position to pick it up. Ask yourself, are you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Working for a young startup that uses Node?&lt;/li&gt;
&lt;li&gt;A student with a lot of time on your hands?&lt;/li&gt;
&lt;li&gt;Someone with a self-destructive personality who stays up until 4am every night learning new frameworks just to post about them?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you answered "yes" to any of these, then picking up Gatsby is definitely worth your time.&lt;/p&gt;

</description>
      <category>gatsby</category>
      <category>jamstack</category>
      <category>javascript</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Building Java Projects with Gradle</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Tue, 09 Jun 2020 14:14:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/building-java-projects-with-gradle-nah</link>
      <guid>https://dev.to/hackersandslackers/building-java-projects-with-gradle-nah</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhackersandslackers-cdn.storage.googleapis.com%2F2020%2F02%2Fjava-gradle-1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhackersandslackers-cdn.storage.googleapis.com%2F2020%2F02%2Fjava-gradle-1.jpg" alt="Building Java Projects with Gradle"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I've had a few strongly worded opinions about Java as a language in the past. Be that as it may, choosing a programming language is a luxury that many people don’t have; as long as enterprises exist, there will always be a need for Java developers. According to the &lt;a href="https://insights.stackoverflow.com/survey/2019" rel="noopener noreferrer"&gt;2019 StackOverflow developer survey&lt;/a&gt;, about 40% of developers are actively using Java in some way.&lt;/p&gt;

&lt;p&gt;For those who work mostly in more "modern" programming languages, coming to Java in 2019 has numerous pain points. Installing dependencies by manually downloading and dropping jar files into your Java path is a humbling look into the past where package managers were non-existent. Luckily for us, there are tools like Gradle to help us bridge the gap between older programming languages and the processes we're used to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Java Build Tools
&lt;/h2&gt;

&lt;p&gt;Build tools are useful to Java developers in that they automate many processes associated with packaging a build which would otherwise be manual. Build tools not only compile a project's source code, but also download dependencies and run tests. Java build tools aren't exactly a sexy part of anybody's stack - it can be argued that they simply accomplish the things that we would expect a modern programming language to have natively.&lt;/p&gt;

&lt;p&gt;Three build tools have historically dominated the scene: &lt;a href="https://ant.apache.org/" rel="noopener noreferrer"&gt;Apache Ant&lt;/a&gt; (released 2000), &lt;a href="https://maven.apache.org/" rel="noopener noreferrer"&gt;Apache Maven&lt;/a&gt; (released around 2004), and &lt;a href="https://gradle.org/" rel="noopener noreferrer"&gt;Gradle&lt;/a&gt; (released 2007). We won't waste time comparing these tools since Gradle is objectively better than the other options. If you're curious as to why, I'll humor you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ease-of-use&lt;/strong&gt; : Gradle build scripts are written in Groovy (Ant and Maven are configured via XML files... enough said?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt; : Gradle creates builds faster by only building necessary changes from build-to-build, reusing build outputs from previous builds, and leveraging a running daemon process.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging&lt;/strong&gt; : Part of Gradle's build process is outputting &lt;em&gt;very&lt;/em&gt; useful HTML logs detailing anything that went wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extensible&lt;/strong&gt; : Gradle has a vast ecosystem of available &lt;a href="https://plugins.gradle.org/" rel="noopener noreferrer"&gt;plugins&lt;/a&gt; which open plenty of opportunities including support for Java, C++, or Python.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Installing Gradle
&lt;/h2&gt;

&lt;p&gt;If you haven't done so already, go ahead and install Gradle using Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;gradle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Installing Gradle via Homebrew will automatically configure your path and all that nonsense. Verify that Gradle was installed correctly and you should be good to go:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gradle &lt;span class="nt"&gt;-v&lt;/span&gt;

&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;
Gradle 5.5.1
&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Initiating a Java Project with Gradle
&lt;/h2&gt;

&lt;p&gt;Starting a Java project under normal circumstances is notably obnoxious when compared to other programming languages. Java's philosophy around namespaces results in ridiculously complex folder structures for any Java project, even a simple &lt;em&gt;hello world&lt;/em&gt; app. Perhaps the most undersold feature of Gradle is the ability to generate cookie-cutter projects to get started.&lt;/p&gt;

&lt;p&gt;Create a new directory using &lt;code&gt;mkdir myProject&lt;/code&gt; (or whatever you want your project to be).  &lt;code&gt;cd&lt;/code&gt; into that empty directory and run Gradle's init script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;gradle init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gradle is going to prompt us with a few questions to get more context about what we're creating. The first question asks us to select a &lt;em&gt;type&lt;/em&gt; of project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Select &lt;span class="nb"&gt;type &lt;/span&gt;of project to generate:
  1: basic
  2: application
  3: library
  4: Gradle plugin
Enter selection &lt;span class="o"&gt;(&lt;/span&gt;default: basic&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;1..4] 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Creating a &lt;strong&gt;basic&lt;/strong&gt; project only initiates a bare minimum project, which isn't particularly useful. Creating an &lt;strong&gt;application&lt;/strong&gt; , on the other hand, will initiate a barebones folder structure for the language of your choice.&lt;/p&gt;

&lt;p&gt;Select &lt;strong&gt;application (2)&lt;/strong&gt; for your type of project. This should prompt three more multiple-choice questions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Select implementation language:
  1: C++
  2: Groovy
  3: Java
  4: Kotlin
Enter selection &lt;span class="o"&gt;(&lt;/span&gt;default: Java&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;1..4] 3

Select build script DSL:
  1: Groovy
  2: Kotlin
Enter selection &lt;span class="o"&gt;(&lt;/span&gt;default: Groovy&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;1..2] 1

Select &lt;span class="nb"&gt;test &lt;/span&gt;framework:
  1: JUnit 4
  2: TestNG
  3: Spock
  4: JUnit Jupiter
Enter selection &lt;span class="o"&gt;(&lt;/span&gt;default: JUnit 4&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;1..4] 1

Project name &lt;span class="o"&gt;(&lt;/span&gt;default: java-gradle-tutorial&lt;span class="o"&gt;)&lt;/span&gt;: 

Source package &lt;span class="o"&gt;(&lt;/span&gt;default: java-gradle-tutorial&lt;span class="o"&gt;)&lt;/span&gt;: 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we selected &lt;strong&gt;application&lt;/strong&gt; as our project type, Gradle lets us choose from one of four programming languages to build our project (selecting &lt;strong&gt;basic&lt;/strong&gt; wouldn't have prompted us with this).&lt;/p&gt;

&lt;p&gt;Next, we're able to select either Groovy or Kotlin as the language in which our &lt;em&gt;build scripts&lt;/em&gt; will be written. Most people choose Groovy as it's an easy language to pick up, especially in this context.&lt;/p&gt;

&lt;p&gt;Lastly, we're asked to select a test framework to run our unit tests in. JUnit 4 is the most popular, and it's good enough for me.&lt;/p&gt;

&lt;p&gt;Check out our folder structure now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/java-gradle-tutorial
├── build.gradle
├── gradle
│   └── wrapper
│       ├── gradle-wrapper.jar
│       └── gradle-wrapper.properties
├── gradlew
├── gradlew.bat
├── settings.gradle
└── src
    ├── main
    │   ├── java
    │   │   └── com
    │   │       └── hackersandslackers
    │   │           └── gradletutorial
    │   │               └── App.java
    │   └── resources
    └── &lt;span class="nb"&gt;test&lt;/span&gt;
        ├── java
        │   └── com
        │       └── hackersandslackers
        │           └── gradletutorial
        │               └── AppTest.java
        └── resources
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/java-gradle-tutorial





&lt;p&gt;Gradle has started our project by generating the barebones of our project as well as some Gradle-related configuration files. Not only does Gradle create our main module within /src, but it also creates the corresponding folder structure needed to test said module. If we were to create this project without Gradle, we would've had to create 10 directories!&lt;/p&gt;

&lt;p&gt;Let's take a look at the files related to using Gradle in our project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;build.gradle&lt;/strong&gt; : This is the file we'll use to configure our builds. This is where we specify things like dependencies to download, tasks to run, projects to import, etc. This is the file we'll do the most work in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;settings.gradle&lt;/strong&gt; : Additional settings for our Gradle build.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gradlew&lt;/strong&gt; : &lt;code&gt;gradlew&lt;/code&gt; stands for "Gradle wrapper" and is the file we'll use to execute our builds (for example: &lt;code&gt;$ ./gradlew build&lt;/code&gt;). When we initialize Gradle in a project we actually decouple our project's Gradle from our system's Gradle, which means we could hand off our source to somebody who doesn't have Gradle installed and they'd still be able to build our project.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Core Tasks for Java Projects
&lt;/h2&gt;

&lt;p&gt;Before we get into any customization/configuration stuff, let's see what Gradle offers us out of the box! Typing &lt;code&gt;./gradlew tasks&lt;/code&gt; in your project directory lists which &lt;strong&gt;tasks&lt;/strong&gt; you can run in your project, such as building or running your code. Because we initialized Gradle as a Java project, we have this specific list of tasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./gradlew tasks

&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;
Tasks runnable from root project
&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;

Application tasks
&lt;span class="nt"&gt;-----------------&lt;/span&gt;
run - Runs this project as a JVM application

Build tasks
&lt;span class="nt"&gt;-----------&lt;/span&gt;
assemble - Assembles the outputs of this project.
build - Assembles and tests this project.
buildDependents - Assembles and tests this project and all projects that depend on it.
buildNeeded - Assembles and tests this project and all projects it depends on.
classes - Assembles main classes.
clean - Deletes the build directory.
jar - Assembles a jar archive containing the main classes.
testClasses - Assembles &lt;span class="nb"&gt;test &lt;/span&gt;classes.

Build Setup tasks
&lt;span class="nt"&gt;-----------------&lt;/span&gt;
init - Initializes a new Gradle build.
wrapper - Generates Gradle wrapper files.

Distribution tasks
&lt;span class="nt"&gt;------------------&lt;/span&gt;
assembleDist - Assembles the main distributions
distTar - Bundles the project as a distribution.
distZip - Bundles the project as a distribution.
installDist - Installs the project as a distribution as-is.

Documentation tasks
&lt;span class="nt"&gt;-------------------&lt;/span&gt;
javadoc - Generates Javadoc API documentation &lt;span class="k"&gt;for &lt;/span&gt;the main &lt;span class="nb"&gt;source &lt;/span&gt;code.

Verification tasks
&lt;span class="nt"&gt;------------------&lt;/span&gt;
check - Runs all checks.
&lt;span class="nb"&gt;test&lt;/span&gt; - Runs the unit tests.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No matter what you're building, you'll be using a handful of these tasks all the time. Here are a few of the most commonly used tasks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;./gradlew build&lt;/code&gt; will compile, test, and package your project, writing the output to a &lt;em&gt;/build&lt;/em&gt; folder.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;./gradlew run&lt;/code&gt; will run the compiled code in your build folder.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;./gradlew clean&lt;/code&gt; will purge that build folder.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;./gradlew test&lt;/code&gt; will compile anything that is out of date and execute your unit tests, without packaging or running the application.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these tasks can be chained together for convenience. For example, &lt;code&gt;./gradlew clean build run&lt;/code&gt; will create a new build of your Java project from scratch and then run said project.&lt;/p&gt;

&lt;p&gt;These are just the core tasks that come standard with Java applications; we can script our own tasks as well. For that, we'll need to get deeper into configuring Gradle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuring &amp;amp; Customizing Gradle
&lt;/h2&gt;

&lt;p&gt;Gradle is much more useful to us when we can configure it to do things like download dependencies, import other projects, and execute specific parts of our code at runtime. All of this magic is contained in a &lt;strong&gt;build.gradle&lt;/strong&gt; file, which is usually made up of four major parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tasks&lt;/strong&gt; are scripts Gradle can execute. The core tasks available to Java applications will suit most needs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dependencies&lt;/strong&gt; are third-party .jar files to fetch from a repository and package with your project.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugins&lt;/strong&gt; extend Gradle with additional functionality (we should have the &lt;code&gt;java&lt;/code&gt; and &lt;code&gt;application&lt;/code&gt; plugins active by default).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Projects&lt;/strong&gt; are standalone applications being packaged together in a single build. Not all Gradle builds consist of multiple projects, but multi-project builds are a big plus of Gradle.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before we even jump into those, we need to tell Gradle where the main class of our application lives.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting a Main Class
&lt;/h3&gt;

&lt;p&gt;The first thing we should set in &lt;strong&gt;build.gradle&lt;/strong&gt; is a variable named &lt;code&gt;mainClassName&lt;/code&gt;. This is the fully qualified name of our main Java class, not a file path. Each time we run our application, Gradle will look for this class among the sources compiled from &lt;code&gt;src/main/java&lt;/code&gt;. It's important to note that &lt;code&gt;mainClassName&lt;/code&gt; expects the package name to be part of the class name, so at the very least we need our &lt;code&gt;mainClassName&lt;/code&gt; to contain &lt;code&gt;[PACKAGE_NAME].[CLASS_NAME]&lt;/code&gt;. Here's my main class name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;mainClassName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"com.hackersandslackers.gradletutorial.App"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So my class is named &lt;code&gt;App&lt;/code&gt; and lives in a package called &lt;code&gt;com.hackersandslackers.gradletutorial&lt;/code&gt;. If you recall the folder structure that Gradle created for us, the inside of &lt;strong&gt;src&lt;/strong&gt; reflects this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/src
└── main
    └── java
        └── com
            └── hackersandslackers
                └── gradletutorial
                    └── App.java
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
/src directory





&lt;p&gt;Lastly, make sure you set your package name in your App.java file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.hackersandslackers.gradletutorial&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;App&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;getGreeting&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Hello world."&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;App&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;getGreeting&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
App.java
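
&lt;p&gt;As a hedged aside: recent versions of the &lt;code&gt;application&lt;/code&gt; plugin also accept the main class through an &lt;code&gt;application&lt;/code&gt; block, and they flag the top-level &lt;code&gt;mainClassName&lt;/code&gt; variable as deprecated. Here's a minimal sketch, assuming a Gradle release new enough to support the &lt;code&gt;mainClass&lt;/code&gt; property and reusing the class name from above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;// build.gradle (sketch; assumes a recent Gradle release with the application plugin applied)
application {
    // Fully qualified name of the class that contains main()
    mainClass = 'com.hackersandslackers.gradletutorial.App'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Either form points &lt;code&gt;./gradlew run&lt;/code&gt; at the same class, so use whichever one your Gradle version accepts without warnings.&lt;/p&gt;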





&lt;h3&gt;
  
  
  Gradle Plugins
&lt;/h3&gt;

&lt;p&gt;Gradle provides us with a few "core" plugins out-of-the-box to build Java applications. Unsurprisingly, the two we have are named "Java" and "Application":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;plugins&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="s1"&gt;'java'&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="s1"&gt;'application'&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;apply&lt;/span&gt; &lt;span class="nl"&gt;plugin:&lt;/span&gt;&lt;span class="s1"&gt;'application'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
build.gradle





&lt;p&gt;To understand what these two plugins do, try deleting them from your &lt;strong&gt;build.gradle&lt;/strong&gt; file and running the same &lt;code&gt;./gradlew tasks&lt;/code&gt; command we ran earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./gradlew tasks

&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;
Tasks runnable from root project
&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;

Build Setup tasks
&lt;span class="nt"&gt;-----------------&lt;/span&gt;
init - Initializes a new Gradle build.
wrapper - Generates Gradle wrapper files.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All our tasks are gone! The &lt;code&gt;java&lt;/code&gt; and &lt;code&gt;application&lt;/code&gt; plugins actually contain basically &lt;em&gt;all&lt;/em&gt; the useful things we gained from initializing Gradle in the first place. As it turns out, Gradle is designed to be modular to the point where Gradle alone is nothing but a wrapper designed to serve as a medium for plugins and custom logic. You can add those plugins back now.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installing Java Dependencies
&lt;/h3&gt;

&lt;p&gt;If you've built Java projects before, you're already aware of what a painfully manual process this normally is. Java has no inherent &lt;strong&gt;PyPI&lt;/strong&gt; or &lt;strong&gt;npm&lt;/strong&gt; equivalent: people used to download .jar files and manually place them into their project directory like a bunch of absolute savages. Gradle does a decent job of making this process easier by handling it in &lt;strong&gt;build.gradle&lt;/strong&gt;.&lt;/p&gt;

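&lt;p&gt;First, we need to set the remote repository we want to download Java packages from. People usually use either &lt;code&gt;mavenCentral()&lt;/code&gt; or &lt;code&gt;jcenter()&lt;/code&gt;. This example uses &lt;code&gt;jcenter()&lt;/code&gt;, but JCenter has since been sunset by its maintainer, so &lt;code&gt;mavenCentral()&lt;/code&gt; is the safer choice for new projects:&lt;br&gt;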
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;repositories&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;jcenter&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
build.gradle





&lt;p&gt;With our &lt;strong&gt;repositories&lt;/strong&gt; block created, we can then specify which dependencies to download. Below I specify that I'd like my builds to include Guava and a MySQL connector, and to use the JUnit testing library for tests:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="o"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;repositories&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;jcenter&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;dependencies&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;implementation&lt;/span&gt; &lt;span class="s1"&gt;'com.google.guava:guava:26.0-jre'&lt;/span&gt;
    &lt;span class="n"&gt;implementation&lt;/span&gt; &lt;span class="nl"&gt;group:&lt;/span&gt; &lt;span class="s1"&gt;'mysql'&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;name:&lt;/span&gt; &lt;span class="s1"&gt;'mysql-connector-java'&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;version:&lt;/span&gt; &lt;span class="s1"&gt;'5.1.13'&lt;/span&gt;

    &lt;span class="n"&gt;testImplementation&lt;/span&gt; &lt;span class="s1"&gt;'junit:junit:4.12'&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
build.gradle





&lt;h3&gt;
  
  
  Gradle Tasks
&lt;/h3&gt;

&lt;p&gt;A major part of &lt;strong&gt;build.gradle&lt;/strong&gt; is scripting custom &lt;em&gt;tasks&lt;/em&gt;. Tasks are snippets that we can run directly from the command line in our project directory (as in &lt;code&gt;./gradlew [TASK_NAME]&lt;/code&gt;). Here's a generic task that prints something:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'hello'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;doLast&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;println&lt;/span&gt; &lt;span class="s2"&gt;"hello"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
build.gradle





&lt;p&gt;The value in parentheses is the name of our task. If our task has no specified name, the task will run during every Gradle build by default.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;doLast&lt;/code&gt; is a built-in method that adds the code in its block to the end of the task's list of actions, so it is executed last. If we wanted an action to occur first, we could use &lt;code&gt;doFirst&lt;/code&gt; instead, which adds its block to the front of that list.&lt;/p&gt;

&lt;p&gt;With that knowledge, here's a task which will run during every build and print "hello" followed by "world":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;doFirst&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;println&lt;/span&gt; &lt;span class="s2"&gt;"hello"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;doLast&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;println&lt;/span&gt; &lt;span class="s2"&gt;"world"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
build.gradle
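
&lt;p&gt;For a variant we can trigger on demand, here's a sketch of the same idea as a named task registered with &lt;code&gt;tasks.register&lt;/code&gt;. The task name &lt;code&gt;greet&lt;/code&gt; is made up for this example. The two blocks are deliberately written in reverse to show that their order in the file does not decide the order of execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;// build.gradle (sketch; 'greet' is a hypothetical task name)
tasks.register('greet') {
    doLast {
        // Added to the end of the task's action list
        println 'world'
    }
    doFirst {
        // Added to the front of the task's action list
        println 'hello'
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running &lt;code&gt;./gradlew -q greet&lt;/code&gt; should print "hello" and then "world", and the task stays quiet unless we ask for it by name.&lt;/p&gt;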





&lt;p&gt;We can also add some helpful metadata to our tasks for whoever is using the command line and wants to interact with our project. Here we add a group to our task and add some helpful description text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;tasks&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;register&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"hello"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;group&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Worthless tasks'&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'An utterly useless task'&lt;/span&gt;

    &lt;span class="n"&gt;doLast&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;println&lt;/span&gt; &lt;span class="s1"&gt;'hello'&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
build.gradle





&lt;p&gt;Now when anyone runs &lt;code&gt;./gradlew tasks&lt;/code&gt; in our project directory, they'll see the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;./gradlew tasks

Worthless tasks
&lt;span class="nt"&gt;---------------&lt;/span&gt;
hello - An utterly useless task
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Multi-project Builds
&lt;/h3&gt;

&lt;p&gt;The last thing worth looking at in Gradle is packaging multiple projects at once. In these types of setups, our top-level directory would contain multiple projects within it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;├── project1/
├── project2/
├── build.gradle
└── settings.gradle
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With two project directories, we can now import each project into our top-level &lt;strong&gt;settings.gradle&lt;/strong&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;rootProject&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'java-gradle-tutorial'&lt;/span&gt;

&lt;span class="n"&gt;include&lt;/span&gt; &lt;span class="s1"&gt;'project1'&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'project2'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
settings.gradle





&lt;p&gt;Nested projects in multi-project builds can each have their own standalone Gradle configurations with unique tasks and dependencies!&lt;/p&gt;
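
&lt;p&gt;As a rough illustration, a nested project's own &lt;strong&gt;build.gradle&lt;/strong&gt; might look like the sketch below. It reuses the &lt;code&gt;project1&lt;/code&gt; and &lt;code&gt;project2&lt;/code&gt; directories from the layout above and assumes, purely for the sake of example, that &lt;code&gt;project1&lt;/code&gt; uses code from &lt;code&gt;project2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;// project1/build.gradle (sketch; the dependency on project2 is an assumption)
plugins {
    id 'java'
}

repositories {
    mavenCentral()
}

dependencies {
    // Depend on the sibling project declared in settings.gradle
    implementation project(':project2')

    // Dependencies declared here apply to project1 only
    testImplementation 'junit:junit:4.12'
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Tasks can then be run for a single project by prefixing the task with the project's path, as in &lt;code&gt;./gradlew :project1:build&lt;/code&gt;.&lt;/p&gt;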

</description>
      <category>java</category>
      <category>softwaredevelopment</category>
      <category>gradle</category>
      <category>groovy</category>
    </item>
    <item>
      <title>Deploy a Golang Web Application Behind Nginx</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Mon, 01 Jun 2020 13:00:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/deploy-a-golang-web-application-behind-nginx-3b8i</link>
      <guid>https://dev.to/hackersandslackers/deploy-a-golang-web-application-behind-nginx-3b8i</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhackersandslackers-cdn.storage.googleapis.com%2F2020%2F05%2Fgolang-nginx-3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fhackersandslackers-cdn.storage.googleapis.com%2F2020%2F05%2Fgolang-nginx-3.jpg" alt="Deploy a Golang Web Application Behind Nginx"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We started last week strong with a &lt;a href="https://hackersandslackers.com/create-your-first-golang-app/" rel="noopener noreferrer"&gt;foray into Golang&lt;/a&gt;, where we created a simple web app serving a "Hello world" route. For those of you who were enticed by this deviation from our regular programming, the next logical question you might have could be how to make this knowledge &lt;em&gt;"useful"&lt;/em&gt; by making it accessible by other human beings.&lt;/p&gt;

&lt;p&gt;We're going to build on our Golang momentum from last week to do just that: deploy a web application written in Go to a Linux host such as Ubuntu. We're going to cover everything from installing Go to creating a systemd service and configuring Nginx. All you need is a VPS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;In case you haven't done so already, make sure your VPS has Nginx installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;apt update
&lt;span class="nv"&gt;$ &lt;/span&gt;apt upgrade &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;span class="nv"&gt;$ &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Install Nginx





&lt;h3&gt;
  
  
  Installing Go on Linux
&lt;/h3&gt;

&lt;p&gt;We're going to install Go from the official binary release. Pick the archive that matches your OS and CPU architecture from the &lt;a href="https://golang.org/dl/" rel="noopener noreferrer"&gt;Go downloads page&lt;/a&gt;. We'll download this to our &lt;strong&gt;/tmp&lt;/strong&gt; folder, unpack it, and move the unpacked directory to where it belongs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /tmp
&lt;span class="nv"&gt;$ &lt;/span&gt;wget https://dl.google.com/go/go1.14.3.linux-amd64.tar.gz
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;tar&lt;/span&gt; &lt;span class="nt"&gt;-xvf&lt;/span&gt; go1.14.3.linux-amd64.tar.gz
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo mv &lt;/span&gt;go /usr/local
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Install Go from the binary archive.





&lt;p&gt;We've just unpacked the Go language and moved it to &lt;strong&gt;/usr/local&lt;/strong&gt;, where Linux typically keeps locally installed software. This path is what Go refers to as the &lt;strong&gt;GOROOT&lt;/strong&gt;, and its contents should look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/usr/local/go
├── AUTHORS
├── CONTRIBUTING.md
├── CONTRIBUTORS
├── LICENSE
├── PATENTS
├── README.md
├── SECURITY.md
├── VERSION
├── /api
├── /bin
├── /doc
├── favicon.ico
├── /lib
├── /misc
├── /pkg
├── robots.txt
├── /src
└── /test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Contents of GOROOT





&lt;h3&gt;
  
  
  Add GOPATH and GOROOT to your Shell Script
&lt;/h3&gt;

&lt;p&gt;We've installed and unpacked Go, but we haven't given our OS a way to recognize that we've done so just yet. We can accomplish this by modifying our shell profile, which is typically called &lt;strong&gt;.profile&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vim ~/.profile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Edit ~/.profile





&lt;p&gt;Here we'll add our &lt;strong&gt;GOROOT&lt;/strong&gt; and &lt;strong&gt;GOPATH&lt;/strong&gt; file paths. If you'll recall, &lt;strong&gt;GOROOT&lt;/strong&gt; is where the Go toolchain is installed, and &lt;strong&gt;GOPATH&lt;/strong&gt; is the working directory where we keep all Go projects and dependencies. I've chosen to set my &lt;strong&gt;GOPATH&lt;/strong&gt; to &lt;strong&gt;/go&lt;/strong&gt;, which is a directory we have yet to create:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GOPATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/go
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GOROOT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/usr/local/go
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:&lt;span class="nv"&gt;$GOPATH&lt;/span&gt;/bin
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:&lt;span class="nv"&gt;$GOROOT&lt;/span&gt;/bin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
~/.profile





&lt;p&gt;Save your changes and activate the shell script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;source&lt;/span&gt; ~/.profile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Activate ~/.profile





&lt;p&gt;We can now verify that everything's been installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go version
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; go version go1.14.3 linux/amd64
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Verify Installation





&lt;h2&gt;
  
  
  Setting up our GOPATH &amp;amp; Project
&lt;/h2&gt;

&lt;p&gt;We have to create our &lt;strong&gt;GOPATH&lt;/strong&gt; manually, which is as simple as creating the following directories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; /go
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; /go/bin
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; /go/pkg
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; /go/src
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we have a place to keep our Go projects! I'm going to pull down the "Hello world" project we created last week for convenience's sake:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /go
&lt;span class="nv"&gt;$ &lt;/span&gt;go get github.com/hackersandslackers/golang-helloworld
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Fetch a Go project





&lt;p&gt;I've chosen to clone the GitHub repo into my &lt;strong&gt;/src&lt;/strong&gt; path as well, which leaves the structure of my &lt;strong&gt;GOPATH&lt;/strong&gt; looking like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/go
├── /bin
│   └── golang-helloworld
├── /pkg
└── /src
    └── /github.com
        └── /hackersandslackers
            └── /golang-helloworld
                ├── README.md
                ├── go.mod
                ├── go.sum
                ├── golang-helloworld
                ├── main.go
                └── main_test.go
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Contents of GOPATH





&lt;h2&gt;
  
  
  Create an Nginx Config
&lt;/h2&gt;

&lt;p&gt;You may have done this a few times before, but whatever. We're going to set up an Nginx reverse proxy to forward requests to the port our app will be running on, which happens to be port &lt;strong&gt;9100&lt;/strong&gt; in our case. Of course, we need to make sure this port is enabled first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;ufw allow 9100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Open a port





&lt;p&gt;Create a configuration file in the Nginx &lt;strong&gt;/sites-available&lt;/strong&gt; folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;vim /etc/nginx/sites-available/golang-helloworld.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Create Nginx config





&lt;p&gt;We're going to drop the standard boilerplate for a reverse proxy here. The domain I happen to be using for this app is &lt;a href="http://golanghelloworld.hackersandslackers.app/" rel="noopener noreferrer"&gt;golanghelloworld.hackersandslackers.app&lt;/a&gt;. Replace this with the domain of your choice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="s"&gt;[::]:80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

   &lt;span class="kn"&gt;server_name&lt;/span&gt;    &lt;span class="s"&gt;golanghelloworld.hackersandslackers.app&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

   &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$http_host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:9100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="p"&gt;~&lt;/span&gt; &lt;span class="sr"&gt;/.well-known&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;allow&lt;/span&gt; &lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
golang-helloworld.conf





&lt;p&gt;We activate this configuration by creating a symlink from our file in &lt;strong&gt;sites-available&lt;/strong&gt; to &lt;strong&gt;sites-enabled&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; /etc/nginx/sites-available/golang-helloworld.conf /etc/nginx/sites-enabled/golang-helloworld.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Symlink Nginx config





&lt;p&gt;Finally, these changes are applied upon Nginx restart. If the below produces no output, you're in the clear:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;service nginx restart
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Restart Nginx





&lt;h3&gt;
  
  
  SSL With Certbot
&lt;/h3&gt;

&lt;p&gt;Adding SSL is arguably out of scope for this tutorial, but whatever. &lt;a href="https://certbot.eff.org/lets-encrypt/ubuntubionic-nginx" rel="noopener noreferrer"&gt;Certbot&lt;/a&gt; makes adding SSL easy enough that we can blow through it in less than a minute.&lt;/p&gt;

&lt;p&gt;Before we can actually install Certbot, we need to add the proper repositories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get update
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;software-properties-common
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;add-apt-repository universe
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;add-apt-repository ppa:certbot/certbot
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Add Certbot PPA repository





&lt;p&gt;Now we can install Certbot for real:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;certbot python3-certbot-nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Install Certbot





&lt;p&gt;The Certbot CLI is able to accept an &lt;code&gt;--nginx&lt;/code&gt; flag, which scans the configurations we've set up on our machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;certbot &lt;span class="nt"&gt;--nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Create certs based on your Nginx configs.





&lt;p&gt;This will list &lt;em&gt;every&lt;/em&gt; Nginx configuration you have on your machine and prompt you for which app you'd like to set up with SSL. My Ubuntu machine happens to host a bunch of sites. Feel free to check out any of them if you please:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator nginx, Installer nginx

Which names would you like to activate HTTPS &lt;span class="k"&gt;for&lt;/span&gt;?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1: broiestbro.com
2: www.broiestbro.com
3: consider.pizza
4: stockholm.ghostthemes.io
5: hackersandslackers.app
6: hackersandslackers.tools
7: django.hackersandslackers.app
8: flaskblueprints.hackersandslackers.app
9: flasklogin.hackersandslackers.app
10: flasksession.hackersandslackers.app
11: flasksqlalchemy.hackersandslackers.app
12: flaskwtf.hackersandslackers.app
13: plotlydashflask.hackersandslackers.app
14: www.hackersandslackers.tools
15: hustlers.club
16: www.hustlers.club
17: toddbirchard.app
18: golanghelloworld.hackersandslackers.app
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Select the appropriate numbers separated by commas and/or spaces, or leave input
blank to &lt;span class="k"&gt;select &lt;/span&gt;all options shown &lt;span class="o"&gt;(&lt;/span&gt;Enter &lt;span class="s1"&gt;'c'&lt;/span&gt; to cancel&lt;span class="o"&gt;)&lt;/span&gt;: 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Certbot CLI





&lt;p&gt;The configuration I'm looking for is #18. Selecting this will then prompt whether or not we'd like to redirect HTTP traffic to HTTPS, which is something you should definitely do (is there even a reason &lt;em&gt;not&lt;/em&gt; to do this? Let me know in the COMMENTS BELOW and remember to smash that LIKE button).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Please choose whether or not to redirect HTTP traffic to HTTPS, removing HTTP access.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
1: No redirect - Make no further changes to the webserver configuration.
2: Redirect - Make all requests redirect to secure HTTPS access. Choose this &lt;span class="k"&gt;for
&lt;/span&gt;new sites, or &lt;span class="k"&gt;if &lt;/span&gt;you&lt;span class="s1"&gt;'re confident your site works on HTTPS. You can undo this
change by editing your web server'&lt;/span&gt;s configuration.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Select the appropriate number &lt;span class="o"&gt;[&lt;/span&gt;1-2] &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;enter] &lt;span class="o"&gt;(&lt;/span&gt;press &lt;span class="s1"&gt;'c'&lt;/span&gt; to cancel&lt;span class="o"&gt;)&lt;/span&gt;:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Certbot CLI





&lt;p&gt;Select option &lt;strong&gt;2&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Check out how Certbot has modified our original &lt;strong&gt;golang-helloworld.conf&lt;/strong&gt; Nginx config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

   &lt;span class="kn"&gt;server_name&lt;/span&gt;    &lt;span class="s"&gt;golanghelloworld.hackersandslackers.app&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

   &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$http_host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:9100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="p"&gt;~&lt;/span&gt; &lt;span class="sr"&gt;/.well-known&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;allow&lt;/span&gt; &lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="s"&gt;[::]:443&lt;/span&gt; &lt;span class="s"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;443&lt;/span&gt; &lt;span class="s"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;ssl_certificate&lt;/span&gt; &lt;span class="n"&gt;/etc/letsencrypt/live/golanghelloworld.hackersandslackers.app/fullchain.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;ssl_certificate_key&lt;/span&gt; &lt;span class="n"&gt;/etc/letsencrypt/live/golanghelloworld.hackersandslackers.app/privkey.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;include&lt;/span&gt; &lt;span class="n"&gt;/etc/letsencrypt/options-ssl-nginx.conf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;ssl_dhparam&lt;/span&gt; &lt;span class="n"&gt;/etc/letsencrypt/ssl-dhparams.pem&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="kn"&gt;if&lt;/span&gt; &lt;span class="s"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;golanghelloworld.hackersandslackers.app)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;301&lt;/span&gt; &lt;span class="s"&gt;https://&lt;/span&gt;&lt;span class="nv"&gt;$host$request_uri&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

   &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
   &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="s"&gt;[::]:80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

   &lt;span class="kn"&gt;server_name&lt;/span&gt;    &lt;span class="s"&gt;golanghelloworld.hackersandslackers.app&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
golang-helloworld.conf





&lt;p&gt;That's what we like to see.&lt;/p&gt;
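
&lt;p&gt;To see the other side of that &lt;code&gt;proxy_pass&lt;/code&gt;, here is a minimal sketch of a Go app that could sit behind this config. It is not the actual golang-helloworld source: the &lt;code&gt;clientIP&lt;/code&gt; and &lt;code&gt;scheme&lt;/code&gt; helpers are hypothetical names, and the only details taken from the config above are the &lt;code&gt;127.0.0.1:9100&lt;/code&gt; address and the &lt;code&gt;X-Real-IP&lt;/code&gt; and &lt;code&gt;X-Forwarded-Proto&lt;/code&gt; headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "log"
    "net/http"
)

// clientIP returns the visitor address Nginx stored in X-Real-IP, falling
// back to the socket address when the header is absent (a direct request).
func clientIP(r *http.Request) string {
    if ip := r.Header.Get("X-Real-IP"); ip != "" {
        return ip
    }
    return r.RemoteAddr
}

// scheme reports the protocol the visitor used, as forwarded by Nginx.
func scheme(r *http.Request) string {
    if proto := r.Header.Get("X-Forwarded-Proto"); proto != "" {
        return proto
    }
    return "http"
}

// hello is a hypothetical handler that echoes what Nginx told us.
func hello(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hello %s, you arrived over %s\n", clientIP(r), scheme(r))
}

func main() {
    http.HandleFunc("/", hello)
    // Bind to loopback only, so Nginx is the sole public entry point.
    log.Fatal(http.ListenAndServe("127.0.0.1:9100", nil))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the app only ever sees connections from Nginx, the forwarded headers are the only way for it to learn the visitor's real address and whether they came in over HTTPS.&lt;/p&gt;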

&lt;h2&gt;
  
  
  Create a Systemctl Service
&lt;/h2&gt;

&lt;p&gt;If you've never messed around with Systemctl services before, a "service" is something we want to run continuously on our server (for example, &lt;strong&gt;Nginx&lt;/strong&gt; is a service in itself). We're going to create a service for our Go app to make sure our app is always running, even if our server is restarted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vim /etc/systemd/system/golanghelloworld.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Add a service





&lt;p&gt;The syntax for Linux services follows the &lt;strong&gt;.ini&lt;/strong&gt; file format. There's a lot here which we'll dissect in a moment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;golanghelloworld.hackersandslackers.app&lt;/span&gt;
&lt;span class="py"&gt;ConditionPathExists&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/go/src/github.com/hackersandslackers/golang-helloworld&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;simple&lt;/span&gt;
&lt;span class="py"&gt;User&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;root&lt;/span&gt;
&lt;span class="py"&gt;Group&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;root&lt;/span&gt;

&lt;span class="py"&gt;WorkingDirectory&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/go/src/github.com/hackersandslackers/golang-helloworld&lt;/span&gt;
&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/go/src/github.com/hackersandslackers/golang-helloworld/golang-helloworld&lt;/span&gt;

&lt;span class="py"&gt;Restart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;on-failure&lt;/span&gt;
&lt;span class="py"&gt;RestartSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;10&lt;/span&gt;

&lt;span class="py"&gt;ExecStartPre&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/bin/mkdir -p /var/log/golang-helloworld&lt;/span&gt;
&lt;span class="py"&gt;ExecStartPre&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/bin/chown syslog:adm /var/log/golang-helloworld&lt;/span&gt;
&lt;span class="py"&gt;ExecStartPre&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/bin/chmod 775 /go/src/github.com/hackersandslackers/golang-helloworld/golang-helloworld&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;multi-user.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
golanghelloworld.service





&lt;p&gt;Here are the notable values being set above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;User&lt;/code&gt; &amp;amp; &lt;code&gt;Group&lt;/code&gt;: Probably not the best idea in the world, but this tells our service to run our app as the Ubuntu root user. Feel free to change this to a different Linux user.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;WorkingDirectory&lt;/code&gt;: The working directory that we'll be serving our app from.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ExecStart&lt;/code&gt;: The compiled binary file of our Go project.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Restart&lt;/code&gt; &amp;amp; &lt;code&gt;RestartSec&lt;/code&gt;: These values are exceptionally useful for ensuring that our app comes back up after crashing from unforeseen circumstances. The &lt;strong&gt;on-failure&lt;/strong&gt; value for &lt;strong&gt;Restart&lt;/strong&gt; tells our service to restart the app whenever it exits uncleanly (for example, with a non-zero exit code or after being killed by a signal), and &lt;strong&gt;RestartSec&lt;/strong&gt; tells it to wait 10 seconds before each restart attempt.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ExecStartPre&lt;/code&gt;: Each of these lines contains a command to run prior to starting our app. We set three such commands: the first two ensure that our app logs correctly to a directory called &lt;strong&gt;/var/log/golang-helloworld&lt;/strong&gt;. The third (and more important) command sets permissions on our Go binary file.&lt;/li&gt;
&lt;/ul&gt;
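
&lt;p&gt;From the app's side, what counts as a "failure" for &lt;strong&gt;Restart=on-failure&lt;/strong&gt; mostly comes down to the process exit status. The following is a rough sketch rather than code from the actual project: &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;exitCode&lt;/code&gt; are hypothetical names, and &lt;code&gt;run&lt;/code&gt; simply simulates a crash:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "errors"
    "fmt"
    "os"
)

// exitCode maps the result of run to a process exit status.
func exitCode(err error) int {
    if err == nil {
        return 0 // clean exit: Restart=on-failure leaves the service stopped
    }
    return 1 // non-zero exit: the service is restarted after RestartSec
}

// run stands in for the real application (hypothetical); here it always fails.
func run() error {
    return errors.New("simulated crash")
}

func main() {
    err := run()
    if err != nil {
        // Anything written to stderr shows up in the service's logs.
        fmt.Fprintln(os.Stderr, "fatal:", err)
    }
    os.Exit(exitCode(err))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calling &lt;code&gt;log.Fatal&lt;/code&gt; has the same effect as the failure path here, since it exits with status 1.&lt;/p&gt;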

&lt;p&gt;Save your service file. Below we register the changes we've made to our system's services, start our new service, and &lt;strong&gt;enable&lt;/strong&gt; our service to be run upon system startup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;systemctl daemon-reload
&lt;span class="nv"&gt;$ &lt;/span&gt;service golanghelloworld start
&lt;span class="nv"&gt;$ &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; golanghelloworld
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Start service





&lt;p&gt;Let's check to see how things went:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;service golanghelloworld status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Check service status





&lt;p&gt;If you happen to be very lucky, you'll see a SUCCESS output like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;golanghelloworld.service - golanghelloworld.hackersandslackers.app
   Loaded: loaded &lt;span class="o"&gt;(&lt;/span&gt;/etc/systemd/system/golanghelloworld.service&lt;span class="p"&gt;;&lt;/span&gt; disabled&lt;span class="p"&gt;;&lt;/span&gt; vendor preset: enabled&lt;span class="o"&gt;)&lt;/span&gt;
   Active: active &lt;span class="o"&gt;(&lt;/span&gt;running&lt;span class="o"&gt;)&lt;/span&gt; since Fri 2020-05-29 01:57:02 UTC&lt;span class="p"&gt;;&lt;/span&gt; 4s ago
  Process: 21454 &lt;span class="nv"&gt;ExecStartPre&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/bin/chmod 775 /go/src/github.com/hackersandslackers/golang-helloworld/golang-helloworld &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exited, &lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/SUCCESS&lt;span class="o"&gt;)&lt;/span&gt;
  Process: 21449 &lt;span class="nv"&gt;ExecStartPre&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/bin/chmod 775 /var/log/golang-helloworld &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exited, &lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/SUCCESS&lt;span class="o"&gt;)&lt;/span&gt;
  Process: 21445 &lt;span class="nv"&gt;ExecStartPre&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/bin/chown syslog:adm /var/log/golang-helloworld &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exited, &lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/SUCCESS&lt;span class="o"&gt;)&lt;/span&gt;
  Process: 21444 &lt;span class="nv"&gt;ExecStartPre&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/bin/mkdir &lt;span class="nt"&gt;-p&lt;/span&gt; /var/log/golang-helloworld &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;exited, &lt;span class="nv"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0/SUCCESS&lt;span class="o"&gt;)&lt;/span&gt;
 Main PID: 21455 &lt;span class="o"&gt;(&lt;/span&gt;golang-hellowor&lt;span class="o"&gt;)&lt;/span&gt;
    Tasks: 6 &lt;span class="o"&gt;(&lt;/span&gt;limit: 4915&lt;span class="o"&gt;)&lt;/span&gt;
   CGroup: /system.slice/golanghelloworld.service
           └─21455 /go/src/github.com/hackersandslackers/golang-helloworld/golang-helloworld
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Service status





&lt;h3&gt;
  
  
  Debugging Systemctl Services
&lt;/h3&gt;

&lt;p&gt;The unfortunate truth about creating Linux services is that there are a lot of moving parts at play; I don't think I've ever gotten a new service to work on the first attempt without some sort of error. These errors range from permission errors to incorrect file paths to port clashes. The good news is that each of these problems is individually simple to fix.&lt;/p&gt;

&lt;p&gt;Your first line of defense when debugging services is &lt;code&gt;journalctl&lt;/code&gt;, which shows the log output of any service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; golanghelloworld.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Check logs for a service





&lt;p&gt;An indispensable tool for debugging issues is the ability to look up running processes to see if they're behaving properly. We can search active processes by port or by name. If &lt;code&gt;journalctl&lt;/code&gt; outputs an error that port &lt;strong&gt;9100&lt;/strong&gt; is in use, you can find the process holding that port with &lt;code&gt;lsof&lt;/code&gt; (grepping &lt;code&gt;ps&lt;/code&gt; output for the number won't do it, since &lt;code&gt;ps&lt;/code&gt; doesn't list ports):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;lsof &lt;span class="nt"&gt;-i&lt;/span&gt; :9100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Search for process by port number





&lt;p&gt;We can kill whichever process comes back with &lt;code&gt;kill -9 [PID]&lt;/code&gt;.&lt;/p&gt;
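
&lt;p&gt;If you'd rather check for a port clash from Go itself, one rough approach is to try binding the address and see whether it fails. This is only a sketch: &lt;code&gt;portFree&lt;/code&gt; is a hypothetical helper, and the result is a snapshot that can change the moment after it returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "net"
)

// portFree reports whether addr can be bound right now. It binds the
// address and immediately releases it again.
func portFree(addr string) bool {
    ln, err := net.Listen("tcp", addr)
    if err != nil {
        return false // most commonly "address already in use"
    }
    ln.Close()
    return true
}

func main() {
    addr := "127.0.0.1:9100" // the port our Nginx config proxies to
    if portFree(addr) {
        fmt.Println(addr, "is free")
    } else {
        fmt.Println(addr, "is already in use")
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run this while the service is stopped: if the port still reports as in use, some other process is squatting on it.&lt;/p&gt;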

&lt;p&gt;Alternatively, we can check if our app is running by grepping by process name:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;ps aux | &lt;span class="nb"&gt;grep &lt;/span&gt;golang-helloworld
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
Search for process by name





&lt;p&gt;If needed, we could kill the process with &lt;code&gt;pkill -9 golang-helloworld&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If you need to make changes to your &lt;strong&gt;.service&lt;/strong&gt; file, remember to run &lt;code&gt;systemctl daemon-reload&lt;/code&gt; to pick up the changes, and &lt;code&gt;service golanghelloworld restart&lt;/code&gt; to give it another go.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Out There
&lt;/h2&gt;

&lt;p&gt;I have faith you'll work out the kinks and successfully get your Go app up and running. It took me a bit of time myself, but my shitty &lt;strong&gt;hello world&lt;/strong&gt; is up and live in all its glory &lt;a href="http://golanghelloworld.hackersandslackers.app/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you run into issues that you can't seem to solve, feel free to reach out. We'll work it out together.&lt;/p&gt;

</description>
      <category>go</category>
      <category>devops</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Create Your First Golang App</title>
      <dc:creator>Todd Birchard</dc:creator>
      <pubDate>Tue, 26 May 2020 05:50:00 +0000</pubDate>
      <link>https://dev.to/hackersandslackers/create-your-first-golang-app-3cdm</link>
      <guid>https://dev.to/hackersandslackers/create-your-first-golang-app-3cdm</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--v-wKzqzI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/05/golang-gettingstarted.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--v-wKzqzI--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/05/golang-gettingstarted.jpg" alt="Create Your First Golang App"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To be human is to be an unwilling passenger in a winding, aimless journey we call life. Each of us has felt the eternal solidarity of time break apart as we are thrust into existence to navigate the tribulations of existing, left only to wonder what the point of it all is. Just as we become complacent in our respective existential struggles, an event of unspeakable force shakes the foundations of our reality: we fall in love.&lt;/p&gt;

&lt;p&gt;Falling in love is as exhausting as it is enchanting. Our lifespans only have the willing capacity to fall in love a finite number of times, if at all. This realization is responsible for making us wary of falling in love in the first place, as well as zealously defending the love we've found against potential intruders. I'm one to advocate for an opposite conclusion. As difficult, scary, or time-consuming as love may be, I argue that a life that has discovered love on multiple occasions is a life well-lived. If that makes me a &lt;em&gt;slut&lt;/em&gt;, so be it: I am a slut for programming languages.&lt;/p&gt;

&lt;p&gt;I've proudly maintained a long-term loving commitment to Python, as well as a complicated affair with JavaScript. Still, even this promiscuous lifestyle can leave one yearning for a kind of love that only a statically-typed language can deliver. As our relationships transition from scripting honeymoons to mature enterprise-level endeavors, it's natural to second-guess our choices. Why endure the overhead of a dynamically typed language when we end up annotating types with &lt;a href="http://mypy-lang.org/"&gt;MyPy&lt;/a&gt;? Could there be validity in arguments that claim our language of choice is "slow?" And do we really think we can put up with baggage known as the &lt;a href="https://wiki.python.org/moin/GlobalInterpreterLock"&gt;GIL&lt;/a&gt; till death do us part?&lt;/p&gt;

&lt;p&gt;If this all seems like a roundabout way to announce that I've been fooling around with Golang, that's because it absolutely is. I don't suspect that many people are willing or capable of leaving their comfort zones to partake in a journey of this magnitude. For the rest of you, I'd like to welcome you by my side for a short moment in our lives to explore the unexplored. Who knows, perhaps you'll even find love along the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and Setup
&lt;/h2&gt;

&lt;p&gt;Installing on OSX is simple thanks to Homebrew:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;brew update
&lt;span class="nv"&gt;$ &lt;/span&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;golang
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Install Golang via Homebrew





&lt;h3&gt;
  
  
  GOPATH vs GOROOT
&lt;/h3&gt;

&lt;p&gt;Installing Golang via Homebrew automatically generates two directories critical to running Go:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GOROOT&lt;/strong&gt; ( &lt;code&gt;/usr/local/go&lt;/code&gt; ): The Go "root" directory contains the Go installation itself: the compiler, the tooling, and the standard library's source code. Homebrew will automatically register this path for you; there's little reason to mess around in here unless you're a Go contributor or if you're attempting to run multiple versions of Go.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GOPATH&lt;/strong&gt; ( &lt;code&gt;/Users/toddbirchard/go&lt;/code&gt; ): Unlike most programming languages, Go takes an opinionated stance that all projects and dependencies of the language should exist in a single directory known as the &lt;em&gt;GOPATH&lt;/em&gt;. Any time we develop a Go project or install a third-party module, the actions taken ultimately happen inside this directory.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To make sure OSX recognizes our &lt;strong&gt;GOPATH&lt;/strong&gt;, we'll have to add it to our shell's startup script. Open your &lt;strong&gt;.bashrc&lt;/strong&gt;, &lt;strong&gt;.zshrc&lt;/strong&gt;, or whatever it is you use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vim ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Modify shell script





&lt;p&gt;We're going to add the &lt;strong&gt;/go&lt;/strong&gt; base directory, as well as the subdirectory &lt;strong&gt;/go/bin&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GOPATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/Users/toddbirchard/go
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:&lt;span class="nv"&gt;$GOPATH&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PATH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PATH&lt;/span&gt;:&lt;span class="nv"&gt;$GOPATH&lt;/span&gt;/bin
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Add GOPATH to your PATH





&lt;p&gt;Save this and reload your shell script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;.&lt;/span&gt; ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Activate changes





&lt;p&gt;Let's make sure everything went as planned:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go version
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; go version go1.14.2 darwin/amd64
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Verify Installation





&lt;p&gt;And as a last bit of due diligence, let's confirm that our &lt;strong&gt;GOPATH&lt;/strong&gt; is being recognized correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go &lt;span class="nb"&gt;env &lt;/span&gt;GOPATH
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /Users/toddbirchard/go
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Verify GOPATH





&lt;p&gt;As an aside, &lt;code&gt;go&lt;/code&gt; is Golang's CLI which is essential for compiling, formatting, and running Go code, as well as installing &lt;em&gt;modules&lt;/em&gt; (we'll get to those in a sec). Try running &lt;code&gt;go help&lt;/code&gt; to get acquainted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anatomy of GOPATH
&lt;/h2&gt;

&lt;p&gt;Golang's &lt;strong&gt;GOPATH&lt;/strong&gt; is a directory where all your Go code and project dependencies live. The &lt;code&gt;go&lt;/code&gt; CLI actually has a built-in command &lt;code&gt;$ go help gopath&lt;/code&gt; which explains this quite well:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The Go path is used to resolve import statements.&lt;br&gt;&lt;br&gt;
It is implemented by and documented in the go/build package.&lt;br&gt;&lt;br&gt;
The GOPATH environment variable lists places to look for Go code.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Similar to how Python looks for imported libraries in the Python path, Go searches the &lt;strong&gt;GOPATH&lt;/strong&gt; for the same. A notable difference between Python and Go paths is that Go has traditionally expected all your Go projects to live within the GOPATH, specifically &lt;strong&gt;/go/src&lt;/strong&gt; (projects initialized as Go modules, as ours will be, can technically live elsewhere, but we'll stick with the convention here). Contrast this with Python where projects can live anywhere.&lt;/p&gt;

&lt;p&gt;The GOPATH directory is made up of 3 subdirectories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/go
├── /bin
├── /pkg
└── /src
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Structure of our GOPATH.





&lt;p&gt;&lt;code&gt;$ go help gopath&lt;/code&gt; explains the purpose of each of these directories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;src:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The src directory holds source code. The path below src determines the import path or executable name.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;pkg:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The pkg directory holds installed package objects.&lt;br&gt;&lt;br&gt;
As in the Go tree, each target operating system and architecture pair has its own subdirectory of pkg (pkg/GOOS_GOARCH).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;bin:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The bin directory holds compiled commands.&lt;br&gt;&lt;br&gt;
Each command is named for its source directory, but only the final element, not the entire path.&lt;/p&gt;
&lt;/blockquote&gt;
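
&lt;p&gt;If you'd like to see where these directories land on your own machine, the &lt;code&gt;go/build&lt;/code&gt; package mentioned in the help text exposes the GOPATH that Go resolved. A minimal sketch, where &lt;code&gt;workspaceDirs&lt;/code&gt; is a hypothetical helper and not part of any library:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "go/build"
    "path/filepath"
)

// workspaceDirs returns the three conventional subdirectories of a single
// GOPATH entry.
func workspaceDirs(gopath string) []string {
    return []string{
        filepath.Join(gopath, "src"), // source code, organized by import path
        filepath.Join(gopath, "pkg"), // installed package objects
        filepath.Join(gopath, "bin"), // compiled commands
    }
}

func main() {
    // GOPATH may list more than one directory; SplitList separates them.
    for _, entry := range filepath.SplitList(build.Default.GOPATH) {
        for _, dir := range workspaceDirs(entry) {
            fmt.Println(dir)
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The directories printed are simply where Go would look; they may not all exist until you've built or installed something.&lt;/p&gt;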

&lt;p&gt;In short, your personal source code belongs in &lt;strong&gt;/src&lt;/strong&gt;, installed third-party packages will live in &lt;strong&gt;/pkg&lt;/strong&gt;, and compiled commands installed via the &lt;code&gt;go&lt;/code&gt; CLI will live in &lt;strong&gt;/bin&lt;/strong&gt;. To give an example, here's what my path looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/go
├── /bin
│   ├── golint
│   └── tour
├── /pkg
│   ├── /darwin_amd64
│   │   ├── github.com
│   │   ├── go-pandas.a
│   │   ├── golangwebsite
│   │   └── hustlers
│   ├── /mod
│   │   ├── /cache
│   │   ├── /cloud.google.com
│   │   │   └── go@v0.57.0
│   │   ├── /github.com
│   │   │   ├── /google
│   │   │   │   └── go-cmp@v0.4.0
│   │   │   ├── /gorilla
│   │   │   │   └── mux@v1.7.4
│   │   │   ├── /mattn
│   │   │   │   └── go-runewidth@v0.0.7
│   │   │   ├── /olekukonko
│   │   │   │   └── tablewriter@v0.0.4
│   │   │   └── /rocketlaunchr
│   │   │       └── dataframe-go@v0.0.0-20200520082355-50e589cfde42
│   │   └── golang.org
│   └── /sumdb
│       └── sum.golang.org
└── /src
    ├── /golang-helloworld
    │   ├── README.md
    │   ├── go.mod
    │   ├── go.sum
    │   ├── golang-helloworld
    │   └── main.go
    ├── /golang.org
    │   └── x
    └── /hustlers
        ├── README.md
        ├── go.mod
        ├── go.sum
        ├── hustlers
        ├── main.go
        ├── main_test.go
        ├── static
        ├── templates
        └── vendor
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
My GOPATH.





&lt;p&gt;We should be able to break this down quite easily.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;/bin&lt;/strong&gt; contains two Go commands I installed previously. &lt;a href="https://github.com/golang/lint"&gt;golint&lt;/a&gt; is a third-party linter for Go, and "tour" is a local version of the official &lt;a href="https://tour.golang.org/welcome/1"&gt;Tour of Go&lt;/a&gt; walkthrough to help Go newcomers learn their way around the language (I highly recommend completing this, btw).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;/pkg&lt;/strong&gt; contains the packages I've installed. Pay special attention to &lt;em&gt;/pkg/mod&lt;/em&gt;, where you can see I've installed several packages from GitHub under the &lt;em&gt;/github.com&lt;/em&gt; directory. These packages include &lt;a href="https://github.com/rocketlaunchr/dataframe-go"&gt;dataframe-go&lt;/a&gt;, which is a Go implementation of Pandas-like DataFrames, as well as &lt;a href="https://github.com/gorilla/mux"&gt;mux&lt;/a&gt;, which is an HTTP router that we're going to use in our first hello-world project.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;/src&lt;/strong&gt; has three projects I've worked on already. &lt;em&gt;golang-helloworld&lt;/em&gt; is the project we're about to create in this tutorial.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Golang Terminology
&lt;/h2&gt;

&lt;p&gt;Before we get to coding, let's brush up on some basic Go vocabulary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Packages&lt;/strong&gt;: Go programs are made up of "packages," which mirror packaging concepts in other programming languages (think &lt;em&gt;modules&lt;/em&gt; in Python or &lt;em&gt;packages&lt;/em&gt; in Java). Every Golang program contains a package called &lt;strong&gt;main&lt;/strong&gt;, which serves as the project's entry point.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Modules&lt;/strong&gt;: A Go module is a collection of packages that are versioned and distributed together, described by a &lt;strong&gt;go.mod&lt;/strong&gt; file. Third-party libraries are published as modules, which is how they end up as dependencies in your projects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendors&lt;/strong&gt;: This is where things get interesting. While modules can be installed to the &lt;em&gt;/pkg/mod&lt;/em&gt; directory for global use, source projects can contain their &lt;em&gt;own&lt;/em&gt; versions of these modules to avoid clashing dependency versions between projects (this is not dissimilar to Python virtual environments). While not required, you can choose to keep module versions project-specific (we will do this in our example).&lt;/li&gt;
&lt;/ul&gt;
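
&lt;p&gt;To make the "package" idea a little more concrete before we build the real thing, here's a tiny throwaway sketch (the &lt;code&gt;shout&lt;/code&gt; function is made up for illustration). The file declares itself part of package &lt;strong&gt;main&lt;/strong&gt; and imports two packages from Go's standard library:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;package main // the package that holds our program's entry point

import (
    "fmt"     // standard library package for formatted output
    "strings" // standard library package for string helpers
)

// shout is lowercase, so it is only visible inside this package.
// Names starting with a capital letter (like ToUpper) are exported.
func shout(s string) string {
    return strings.ToUpper(s) + "!"
}

func main() {
    fmt.Println(shout("hello, world")) // prints "HELLO, WORLD!"
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Third-party packages are imported the same way, just with a longer import path such as &lt;code&gt;github.com/gorilla/mux&lt;/code&gt;.&lt;/p&gt;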

&lt;h2&gt;
  
  
  Creating a Hello World App
&lt;/h2&gt;

&lt;p&gt;Enough chit-chat, let's make our first Go project. We start with creating our project's directory in the &lt;strong&gt;/go/src&lt;/strong&gt; directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="nv"&gt;$GOPATH&lt;/span&gt;/src
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;golang-helloworld
&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;golang-helloworld
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;While inside our new project directory, we're now going to initialize our project as a Go &lt;em&gt;module.&lt;/em&gt; This means anybody will be able to install our Go code off GitHub if they so choose. I know I'm going to save my repo to &lt;a href="https://hackersandslackers.app/p/eecb7ecb-860f-4acd-a943-32a0350c6017/github.com/hackersandslackers/golang-helloworld"&gt;github.com/hackersandslackers/golang-helloworld&lt;/a&gt;, so we run the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go mod init github.com/hackersandslackers/golang-helloworld
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; go: creating new go.mod: module github.com/hackersandslackers/golang-helloworld
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Creating a Go module.





&lt;p&gt;The moment this is done, a new file will appear in your directory called &lt;strong&gt;go.mod&lt;/strong&gt;. Check out the contents using &lt;code&gt;$ cat go.mod&lt;/code&gt; to see what this initializes with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;module github.com/hackersandslackers/golang-helloworld

go 1.14
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
go.mod





&lt;p&gt;Pretty simple stuff so far! &lt;strong&gt;go.mod&lt;/strong&gt; contains information about our module for others, such as the module name and Go version it is intended for. As we install dependencies for our project, these dependencies and their respective versions will be stored here.&lt;/p&gt;

&lt;h3&gt;
  
  
  main.go
&lt;/h3&gt;

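&lt;p&gt;As mentioned, a Go project's entry point conventionally lives in a file called &lt;strong&gt;main.go&lt;/strong&gt; (strictly speaking, execution starts at the &lt;code&gt;main()&lt;/code&gt; function of &lt;code&gt;package main&lt;/code&gt;, whatever the file is named). We're going to create the simplest &lt;strong&gt;main.go&lt;/strong&gt; file imaginable: a script which outputs &lt;code&gt;"Hello, world."&lt;/code&gt;:&lt;br&gt;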
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"fmt"&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Hello, world."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.go





&lt;p&gt;Now remember: since Go is a compiled language, we need to &lt;em&gt;build&lt;/em&gt; our project before we can run it. I know the extra effort is nearly unbearable, but things are about to pay off as you witness the fruits of your labor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go build
&lt;span class="nv"&gt;$ &lt;/span&gt;go run main.go
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; Hello, world.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Build and run your module.





&lt;p&gt;&lt;em&gt;WE DID IT!&lt;/em&gt; We've just created our first "hello world" app in Go. Your project structure should now look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/golang-helloworld
├── go.mod
├── golang-helloworld
└── main.go
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Our project so far.





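&lt;p&gt;The newly created &lt;strong&gt;golang-helloworld&lt;/strong&gt; file is the compiled executable which is created each time we run &lt;code&gt;$ go build&lt;/code&gt;. Each time we make changes to our source code, we should run &lt;code&gt;$ go build&lt;/code&gt; again to rebuild this executable with our changes. Note that &lt;code&gt;$ go run main.go&lt;/code&gt; compiles and runs the source in a single step without touching this file; to run the built executable itself, use &lt;code&gt;$ ./golang-helloworld&lt;/code&gt;.&lt;/p&gt;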

&lt;h3&gt;
  
  
  Bonus: Code Formatting
&lt;/h3&gt;

&lt;p&gt;A nifty tool that comes out-of-the-box in Go is a code formatter to clean up any ugly indents and such in your source. Try messing up the indents in &lt;strong&gt;main.go&lt;/strong&gt; and run the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go &lt;span class="nb"&gt;fmt&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; main.go
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Formatting source code.





&lt;p&gt;This should fix the formatting in every file named in the output, in our case &lt;strong&gt;main.go&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create a Web App
&lt;/h2&gt;

&lt;p&gt;If we were to leave off with a stupid program that prints &lt;code&gt;"Hello, world!"&lt;/code&gt;, I'd be doing you a disservice. While we've set up Golang successfully, we haven't learned much about creating anything useful yet. It's time for us to kick things up a notch by making our app a web app which can be served from a browser.&lt;/p&gt;

&lt;h3&gt;
  
  
  Installing our First Dependency
&lt;/h3&gt;

&lt;p&gt;To serve Go code via a web server, we're going to leverage the highly popular &lt;a href="https://github.com/gorilla/mux"&gt;gorilla/mux&lt;/a&gt; module: a lightweight request router and dispatcher for matching incoming requests to their respective handler.&lt;/p&gt;

&lt;p&gt;We're going to install this by running &lt;code&gt;$ go get&lt;/code&gt; followed by &lt;code&gt;$ go install&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;go get &lt;span class="nt"&gt;-u&lt;/span&gt; github.com/gorilla/mux
&lt;span class="nv"&gt;$ &lt;/span&gt;go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/gorilla/mux 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Install a dependency.





&lt;p&gt;&lt;code&gt;go get&lt;/code&gt; downloads the source for &lt;strong&gt;gorilla/mux&lt;/strong&gt; to our &lt;strong&gt;/go/pkg/mod&lt;/strong&gt; directory and records it in &lt;strong&gt;go.mod&lt;/strong&gt;. The &lt;code&gt;-u&lt;/code&gt; flag we pass is an "update" flag, which we use to grab the latest version just in case.&lt;/p&gt;

&lt;p&gt;Let's see how &lt;strong&gt;go.mod&lt;/strong&gt; was affected by running &lt;code&gt;$ cat go.mod&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;module github.com/hackersandslackers/golang-helloworld

go 1.14

require github.com/gorilla/mux v1.7.4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
go.mod





&lt;p&gt;As promised, our module dependency has now been added to &lt;strong&gt;go.mod&lt;/strong&gt; along with the proper version number. Now we can import and use &lt;code&gt;"github.com/gorilla/mux"&lt;/code&gt; to help us build a project!&lt;/p&gt;

&lt;p&gt;We can also use &lt;code&gt;$ go mod vendor&lt;/code&gt; to copy this dependency into a &lt;strong&gt;/vendor&lt;/strong&gt; folder inside our project, keeping it local to the project.&lt;/p&gt;

&lt;p&gt;Here's a wall of code which turns our &lt;strong&gt;hello world&lt;/strong&gt; app into a web app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/gorilla/mux"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"net/http"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;
    &lt;span class="s"&gt;"io"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WriteString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Hello, world!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Route declaration&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Router&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRouter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Initiate web server&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;router&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;srv&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Handler&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;router&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Addr&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="s"&gt;"127.0.0.1:9100"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;WriteTimeout&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;ReadTimeout&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="m"&gt;15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenAndServe&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
main.go





&lt;p&gt;Our functions are &lt;code&gt;main()&lt;/code&gt;, &lt;code&gt;router()&lt;/code&gt;, and &lt;code&gt;handler()&lt;/code&gt;, which get executed in that order.&lt;/p&gt;

&lt;h3&gt;
  
  
  main()
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;main()&lt;/code&gt; sets up an HTTP server to be served locally on port &lt;strong&gt;9100&lt;/strong&gt;, with read &amp;amp; write timeouts set as a form of best practice. Our server doesn't do much on its own without any routes to resolve. That's where our &lt;code&gt;router()&lt;/code&gt; function comes in.&lt;/p&gt;

&lt;h3&gt;
  
  
  router()
&lt;/h3&gt;

&lt;p&gt;We initialize a "router" by creating the variable &lt;strong&gt;r&lt;/strong&gt; with &lt;code&gt;r := mux.NewRouter()&lt;/code&gt;. From there we can set as many routes as we'd like with the following syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;URL_ROUTE&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FUNCTION_TO_EXECUTE&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Setting a route with mux.





&lt;p&gt;&lt;code&gt;HandleFunc()&lt;/code&gt; is a method on the router which registers a URL route. The first parameter is the target URL, and the second is the name of a function to be executed when a user requests said route. We only specify a single route in our example, but we could theoretically set as many as we'd like, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRouter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;homeHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/about"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;aboutHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/contact"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contactHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Example of setting multiple routes.
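&lt;p&gt;To see several routes dispatching in one runnable file, here is a minimal sketch. It is a stand-in built on assumptions rather than our actual app: it swaps gorilla/mux for the standard library's &lt;code&gt;http.NewServeMux()&lt;/code&gt; so it runs with no dependencies, the three handler bodies are made up, and &lt;code&gt;fetch()&lt;/code&gt; is a hypothetical helper which fakes a request with &lt;code&gt;httptest&lt;/code&gt; instead of starting a server:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// Made-up handlers: each one writes a short label so we can tell them apart.
func homeHandler(w http.ResponseWriter, r *http.Request)    { io.WriteString(w, "home") }
func aboutHandler(w http.ResponseWriter, r *http.Request)   { io.WriteString(w, "about") }
func contactHandler(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "contact") }

// router registers one handler per route.
// Assumption: http.NewServeMux stands in for mux.NewRouter here.
func router() *http.ServeMux {
	r := http.NewServeMux()
	r.HandleFunc("/", homeHandler)
	r.HandleFunc("/about", aboutHandler)
	r.HandleFunc("/contact", contactHandler)
	return r
}

// fetch is a hypothetical helper: it sends a fake GET request through the
// given handler and returns the body that was written in response.
func fetch(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	r := router()
	fmt.Println(fetch(r, "/"))        // prints "home"
	fmt.Println(fetch(r, "/about"))   // prints "about"
	fmt.Println(fetch(r, "/contact")) // prints "contact"
}
```

&lt;p&gt;One difference worth knowing: the standard library mux treats &lt;code&gt;"/"&lt;/code&gt; as a catch-all for any unmatched path, while gorilla/mux matches it exactly.&lt;/p&gt;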





&lt;h3&gt;
  
  
  handler()
&lt;/h3&gt;

&lt;p&gt;Mux handler functions always accept two parameters, which essentially resolve to &lt;strong&gt;output&lt;/strong&gt; and &lt;strong&gt;input&lt;/strong&gt;. &lt;code&gt;w http.ResponseWriter&lt;/code&gt; declares a parameter named &lt;strong&gt;w&lt;/strong&gt; with the type &lt;strong&gt;http.ResponseWriter&lt;/strong&gt;, which is what we write to in order to render something for the end-user. &lt;code&gt;r *http.Request&lt;/code&gt; contains information about the user's request, saved to a parameter named &lt;strong&gt;r&lt;/strong&gt;.&lt;/p&gt;
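&lt;p&gt;To make the "input and output" idea concrete, here's a small sketch which reads from &lt;strong&gt;r&lt;/strong&gt; and writes to &lt;strong&gt;w&lt;/strong&gt;. Both &lt;code&gt;echoHandler()&lt;/code&gt; and &lt;code&gt;call()&lt;/code&gt; are hypothetical names invented for illustration, and the request is faked with the standard library's &lt;code&gt;httptest&lt;/code&gt; package so nothing needs to be served:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// echoHandler is a hypothetical handler (not part of our app): it reads the
// method and path from the request (input) and writes them to w (output).
func echoHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "%s %s\n", r.Method, r.URL.Path)
}

// call is a hypothetical helper which fakes a request, hands it to
// echoHandler, and returns whatever the handler wrote.
func call(method, path string) string {
	rec := httptest.NewRecorder()
	echoHandler(rec, httptest.NewRequest(method, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Print(call(http.MethodGet, "/about")) // prints "GET /about"
}
```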

&lt;p&gt;We're keeping things simple(ish) today, so we'll settle for our route simply outputting a "hello world" message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WriteString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Hello, world!&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Output a string.





&lt;p&gt;Rebuild and run our project with &lt;code&gt;$ go build&lt;/code&gt; and &lt;code&gt;$ go run main.go&lt;/code&gt;. Now try visiting &lt;strong&gt;127.0.0.1:9100&lt;/strong&gt; in your browser:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jMf1PK8C--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/05/golang-helloworld.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jMf1PK8C--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://hackersandslackers-cdn.storage.googleapis.com/2020/05/golang-helloworld.jpg" alt="Create Your First Golang App"&gt;&lt;/a&gt;Our live Go app.&lt;/p&gt;

&lt;p&gt;And there you have it, lovebirds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;p&gt;Before I leave to give you and your new favorite Gopher some private time, there's some very low-hanging fruit worth picking in an intro tutorial. This won't last long.&lt;/p&gt;

&lt;h3&gt;
  
  
  "Exported" Functions (AKA: Public Versus Private)
&lt;/h3&gt;

&lt;p&gt;Nearly every programming language has the concept of "private" versus "public" functions. Go has this concept as well regarding shared functions between packages. Functions that are "shared" are referred to as "exported functions" (sup JavaScript).&lt;/p&gt;

&lt;p&gt;A name is exported if it begins with a capital letter. Our &lt;strong&gt;hello world&lt;/strong&gt; example consisted solely of private functions (which makes sense, as we only had a single package). If we wanted to make our &lt;code&gt;router()&lt;/code&gt; function accessible by other packages, we'd simply need to rename this to &lt;code&gt;Router()&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Type Declaration
&lt;/h3&gt;

&lt;p&gt;Go expects &lt;strong&gt;variables&lt;/strong&gt;, incoming function &lt;strong&gt;parameters&lt;/strong&gt;, and &lt;strong&gt;function return values&lt;/strong&gt; to have declared types. In the below example, the function &lt;code&gt;add()&lt;/code&gt; accepts two integers and adds them, which undoubtedly results in an integer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Setting types for incoming function parameters.





&lt;p&gt;Variables are set using the same syntax as function parameters, with the variable name coming first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Setting a single integer variable





&lt;p&gt;There's also a shorthand way of setting multiple variables of the same type by separating variable names by commas. In this case, variables &lt;code&gt;x&lt;/code&gt;, &lt;code&gt;y&lt;/code&gt;, and &lt;code&gt;z&lt;/code&gt; are set as integers with no assigned values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Creating 3 variables each sharing the &lt;code&gt;int&lt;/code&gt; type.





&lt;h3&gt;
  
  
  Short Assignment Statements
&lt;/h3&gt;

&lt;p&gt;A very cool feature of Go is the &lt;code&gt;:=&lt;/code&gt; operator. The "short assignment" operator can be used to set multiple variables at once &lt;em&gt;with implicit types&lt;/em&gt;. That means Go will resolve the type of each variable on its own based on the value assigned without the need for explicit type declaration. The below example creates three variables, where &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; are resolved as booleans, and &lt;code&gt;z&lt;/code&gt; is resolved as a string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"no!"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
Implicitly set variable types with the &lt;code&gt;:=&lt;/code&gt; operator.
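&lt;p&gt;Here's a runnable sketch which puts both declaration styles side by side and prints what Go decided. The &lt;code&gt;describe()&lt;/code&gt; function is a hypothetical name for illustration. Two things to keep in mind: &lt;code&gt;:=&lt;/code&gt; only works inside functions, and Go refuses to compile local variables which are never used, which is why the sketch prints them:&lt;/p&gt;

```go
package main

import "fmt"

// describe declares variables both ways and reports their values and types.
func describe() string {
	var a, b, c int               // explicit type, no value: each starts at 0
	x, y, z := true, false, "no!" // types inferred from the assigned values
	return fmt.Sprintf("%d %d %d | %T %T %T", a, b, c, x, y, z)
}

func main() {
	fmt.Println(describe()) // prints "0 0 0 | bool bool string"
}
```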





&lt;h3&gt;
  
  
  Constants
&lt;/h3&gt;

&lt;p&gt;The last noteworthy nugget is the presence of &lt;strong&gt;constants&lt;/strong&gt; in Go. While there's nothing unique about Go supporting constants, it's a breath of fresh air for Pythonistas who may be nostalgic about having the ability to do the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Website&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"hackersandslackers.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
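&lt;p&gt;A constant can be read anywhere a regular value can, but never reassigned. A tiny sketch, where &lt;code&gt;siteURL()&lt;/code&gt; is a hypothetical helper added purely for illustration:&lt;/p&gt;

```go
package main

import "fmt"

// Website is declared once; assigning to it later (Website = "...")
// would stop the program from compiling.
const Website = "hackersandslackers.com"

// siteURL is a hypothetical helper showing a constant used like any other value.
func siteURL() string {
	return "https://" + Website
}

func main() {
	fmt.Println(siteURL()) // prints "https://hackersandslackers.com"
}
```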



&lt;h2&gt;
  
  
  Happily Ever After?
&lt;/h2&gt;

&lt;p&gt;Whether or not you hop on the Go train is a question of where your heart lies. While I'll continue using Python for the majority of what I do, it's nice to leave the Mrs. at home once in a while &lt;em&gt;(inb4 this misogynistic analogy ruins me)&lt;/em&gt; to fool around building quick endpoints in a statically-typed language &lt;em&gt;which isn't Java&lt;/em&gt;. Have I ever mentioned how much &lt;a href="https://twitter.com/ToddRBirchard/status/1088315897847209985"&gt;I hate Oracle&lt;/a&gt;? Like, how much I &lt;a href="https://twitter.com/ToddRBirchard/status/1088682663282642946"&gt;&lt;em&gt;really&lt;/em&gt; hate them&lt;/a&gt;? No? Perhaps another time.&lt;/p&gt;

&lt;p&gt;Anyway, get on with it then. The repository for what we've created today is up on Github here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/hackersandslackers/golang-helloworld"&gt;https://github.com/hackersandslackers/golang-helloworld&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>softwaredevelopment</category>
    </item>
  </channel>
</rss>
