<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Grunet</title>
    <description>The latest articles on DEV Community by Grunet (@grunet).</description>
    <link>https://dev.to/grunet</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F841064%2Ff6a1df07-6795-42f3-99eb-9df786d82249.jpeg</url>
      <title>DEV Community: Grunet</title>
      <link>https://dev.to/grunet</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/grunet"/>
    <language>en</language>
    <item>
      <title>Making a Totally Free Uptime Monitor using a Worker Runtime and OpenTelemetry</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Mon, 03 Jun 2024 17:09:33 +0000</pubDate>
      <link>https://dev.to/grunet/making-a-totally-free-uptime-monitor-using-a-worker-runtime-and-opentelemetry-1bha</link>
      <guid>https://dev.to/grunet/making-a-totally-free-uptime-monitor-using-a-worker-runtime-and-opentelemetry-1bha</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What is an Uptime Monitor and When to Use One?&lt;/li&gt;
&lt;li&gt;Traditional Options&lt;/li&gt;
&lt;li&gt;
Using a Worker Runtime and OpenTelemetry

&lt;ul&gt;
&lt;li&gt;The High-Level Solution&lt;/li&gt;
&lt;li&gt;The High-Level Setup Steps&lt;/li&gt;
&lt;li&gt;Comparison to the Other Options&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Takeaway&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is an Uptime Monitor and When to Use One?
&lt;/h2&gt;

&lt;p&gt;An uptime monitor is a tool that periodically (e.g. every minute) checks your application or API to gauge if it’s up and healthy.&lt;/p&gt;

&lt;p&gt;If you have true observability and are using SLOs effectively you probably don’t need to use one. But if you’re not at that level yet, an uptime monitor can be a valuable information source regarding the reliability of your application or API.&lt;/p&gt;
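&lt;p&gt;To make the idea concrete, here’s a minimal sketch of what a single check might look like (assuming a Node 18+ runtime with a global &lt;code&gt;fetch&lt;/code&gt;; the endpoint is a placeholder):&lt;/p&gt;

```javascript
// Minimal sketch of one uptime check. The endpoint URL is a placeholder.

// A common alerting condition: any 4xx/5xx response is unhealthy.
function isUnhealthy(statusCode) {
  return statusCode >= 400;
}

async function checkOnce(endpoint) {
  try {
    const resp = await fetch(endpoint);
    return { up: !isUnhealthy(resp.status), status: resp.status };
  } catch (_err) {
    // Network-level failures (DNS, TLS, timeouts) also count as down.
    return { up: false, status: null };
  }
}
```

&lt;p&gt;An actual monitor runs a check like this on a schedule and notifies someone whenever &lt;code&gt;up&lt;/code&gt; comes back false.&lt;/p&gt;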

&lt;h2&gt;
  
  
  Traditional Options
&lt;/h2&gt;

&lt;p&gt;There are a number of ways to run an uptime monitor. For example,&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running a cron job on a server/VM and using bash, curl, and webhooks&lt;/li&gt;
&lt;li&gt;Setting up an EventBridge cron with Container/Lambda targets and webhooks&lt;/li&gt;
&lt;li&gt;Paying for a 3rd party service (e.g. Pingdom)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these comes with its own downsides, though:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintenance (e.g. security patching, keeping away from end-of-life states)&lt;/li&gt;
&lt;li&gt;Complexity (e.g. setting up IaC, CI/CD)&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Is there an option that avoids these downsides?&lt;/p&gt;

&lt;h2&gt;
  
  
  Using a Worker Runtime and OpenTelemetry
&lt;/h2&gt;

&lt;p&gt;I contend there is: using a &lt;a href="https://workers.js.org/"&gt;worker runtime&lt;/a&gt; together with &lt;a href="https://opentelemetry.io/"&gt;OpenTelemetry&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The High-Level Solution
&lt;/h3&gt;

&lt;p&gt;The solution maps out at a high level as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use a cron from a worker runtime&lt;/li&gt;
&lt;li&gt;Have the worker hit the application or API endpoint&lt;/li&gt;
&lt;li&gt;Gather instrumentation about the network call with OpenTelemetry&lt;/li&gt;
&lt;li&gt;Send that OpenTelemetry instrumentation to an observability backend&lt;/li&gt;
&lt;li&gt;Use the observability backend to alert on unhealthy traffic &lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The High-Level Setup Steps
&lt;/h3&gt;

&lt;p&gt;These steps will use Cloudflare Workers for the worker runtime, but something similar can be done with Deno Deploy as well.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dash.cloudflare.com/sign-up/workers-and-pages"&gt;Create a free Cloudflare account&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href="https://developers.cloudflare.com/workers/get-started/guide/"&gt;Create a worker&lt;/a&gt; with the following code and the &lt;a href="https://developers.cloudflare.com/workers/runtime-apis/nodejs/#enable-nodejs-from-the-cloudflare-dashboard"&gt;Node.js compatibility flag&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;instrument&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@microlabs/otel-cf-workers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;scheduled&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ENDPOINT_TO_MONITOR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_trigger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;exporter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.honeycomb.io/v1/traces&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-honeycomb-team&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;HONEYCOMB_API_KEY&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ENDPOINT_NAME&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;instrument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://developers.cloudflare.com/workers/configuration/environment-variables/#add-environment-variables-via-the-dashboard"&gt;Add an environment variable&lt;/a&gt; named “ENDPOINT_TO_MONITOR” with the endpoint to check and add another environment variable named “ENDPOINT_NAME” with a friendly name for the endpoint&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.notion.so/Making-a-Totally-Free-Uptime-Monitor-using-a-Worker-Runtime-and-OpenTelemetry-0a4636936b3c40f38dd8c4a474145aec?pvs=21"&gt;Create a free Honeycomb account&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Create an environment named “Uptime Monitors” and &lt;a href="https://docs.honeycomb.io/get-started/configure/environments/manage-api-keys/"&gt;create an ingest key&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Back in Cloudflare, take that ingest key and copy-paste it into a &lt;a href="https://developers.cloudflare.com/workers/configuration/secrets/#via-the-dashboard"&gt;Cloudflare Workers secret&lt;/a&gt; named “HONEYCOMB_API_KEY”&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://developers.cloudflare.com/workers/configuration/cron-triggers/#via-the-dashboard"&gt;Add a cron&lt;/a&gt; of “* * * * *” to the worker&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;(Confirm that traces are appearing every minute in Honeycomb)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In Honeycomb, &lt;a href="https://docs.honeycomb.io/notify/alert/triggers/create/"&gt;create a trigger&lt;/a&gt; (alert) based on the query&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;COUNT&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="nx"&gt;where&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Route the trigger’s notifications as needed (e.g. to Slack)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You should now have a functioning uptime monitor for your endpoint.&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparison to the Other Options
&lt;/h3&gt;

&lt;p&gt;Compared to the other options outlined earlier, this solution has&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimal maintenance (just a single npm package and its dependencies to monitor for security vulnerabilities)&lt;/li&gt;
&lt;li&gt;Minimal complexity (just the steps outlined above)&lt;/li&gt;
&lt;li&gt;Totally free (usage falls well within the &lt;a href="https://developers.cloudflare.com/workers/platform/pricing/#workers"&gt;Cloudflare Workers free tier&lt;/a&gt; and &lt;a href="https://www.honeycomb.io/pricing"&gt;Honeycomb free tier&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;Paying for an uptime monitor service is probably preferable to this (if you’re able to).&lt;/p&gt;

&lt;p&gt;The real takeaway is that this newer form of compute (worker runtimes) has a cost model that can be taken advantage of in situations like this one.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Change Plans: A Subtle Superpower</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Tue, 21 May 2024 16:14:35 +0000</pubDate>
      <link>https://dev.to/grunet/change-plans-a-subtle-superpower-52km</link>
      <guid>https://dev.to/grunet/change-plans-a-subtle-superpower-52km</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What is a Change Plan?&lt;/li&gt;
&lt;li&gt;
What Benefits Do Change Plans Bring?

&lt;ul&gt;
&lt;li&gt;Force You To Think&lt;/li&gt;
&lt;li&gt;Peer Review&lt;/li&gt;
&lt;li&gt;Drive Clarification&lt;/li&gt;
&lt;li&gt;Facilitate Discussion&lt;/li&gt;
&lt;li&gt;Discoverable&lt;/li&gt;
&lt;li&gt;Auditable&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
A Change Plan Template in Detail

&lt;ul&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;li&gt;Impact&lt;/li&gt;
&lt;li&gt;Security&lt;/li&gt;
&lt;li&gt;Communication Plan&lt;/li&gt;
&lt;li&gt;Test plan&lt;/li&gt;
&lt;li&gt;Before the change&lt;/li&gt;
&lt;li&gt;Steps&lt;/li&gt;
&lt;li&gt;Monitoring&lt;/li&gt;
&lt;li&gt;Backout plan&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Takeaway&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is a Change Plan?
&lt;/h2&gt;

&lt;p&gt;A change plan is a document describing a plan to make a nonstandard change to a production environment. For example, manually making changes to hand-curated virtual machines.&lt;/p&gt;

&lt;p&gt;The outline of a change plan might look something like this&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Summary&lt;/li&gt;
&lt;li&gt;Impact&lt;/li&gt;
&lt;li&gt;Security&lt;/li&gt;
&lt;li&gt;Communication Plan&lt;/li&gt;
&lt;li&gt;Test Plan&lt;/li&gt;
&lt;li&gt;Before the change&lt;/li&gt;
&lt;li&gt;Steps&lt;/li&gt;
&lt;li&gt;Monitoring&lt;/li&gt;
&lt;li&gt;Backout Plan&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before diving into each of these sections, let’s discuss why change plans are helpful to begin with.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Benefits Do Change Plans Bring?
&lt;/h2&gt;

&lt;p&gt;There are several distinct benefits change plans bring to the table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Force You To Think
&lt;/h3&gt;

&lt;p&gt;Like any template, a change plan template prompts you to consider each section carefully, determine whether it’s applicable to the change at hand, and fill out the section if so.&lt;/p&gt;

&lt;p&gt;Without a change plan template, it can be easy to forget important aspects of a change, e.g. having a backout plan.&lt;/p&gt;

&lt;h3&gt;
  
  
  Peer Review
&lt;/h3&gt;

&lt;p&gt;Just as with code review, peer review of change plans can be powerful. Not only do you get extra scrutiny of the plan, but you can also get bidirectional knowledge sharing between the participants.&lt;/p&gt;

&lt;p&gt;Without a change plan, there is no knowledge sharing and there’s increased risk of the change going awry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Drive Clarification
&lt;/h3&gt;

&lt;p&gt;The steps of a change plan need to be detailed down to a point where anyone could follow them. This forces any ambiguous language to be clarified, increasing the likelihood of the steps being followed correctly and with the correct outcomes.&lt;/p&gt;

&lt;p&gt;Without a change plan, the steps might be determined on-the-fly and may result in mistakes being made.&lt;/p&gt;

&lt;h3&gt;
  
  
  Facilitate Discussion
&lt;/h3&gt;

&lt;p&gt;With a change plan external participants can comment on and engage in discussions about the change. For example, a Product Manager might request the date of a change be moved since it overlaps with a big feature release.&lt;/p&gt;

&lt;p&gt;Without a change plan there’s no artifact to structure discussion around and, worse, external stakeholders might not be aware a change is happening at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  Discoverable
&lt;/h3&gt;

&lt;p&gt;With a change plan, the change exists in documented history and can be examined by people in the future. For example, people trying to make a similar change might review and learn from it.&lt;/p&gt;

&lt;p&gt;Without a change plan, the details of the change are lost after it’s performed and no one other than the performer knows about it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auditable
&lt;/h3&gt;

&lt;p&gt;A change plan can serve as an artifact for auditors to confirm that you’re following change management procedures correctly.&lt;/p&gt;

&lt;p&gt;Without a change plan some other artifact needs to be created for auditing purposes.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Change Plan Template in Detail
&lt;/h2&gt;

&lt;p&gt;What follows is an example of a change plan template used at a previous job of mine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;A few sentences to let the reviewer know what we are doing and why we are doing it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Impact
&lt;/h3&gt;

&lt;p&gt;How many customers or internal users will be impacted if things go right? How about if things go south / pear-shaped / blow up?&lt;/p&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;This change increases | temporarily decreases | decreases | has no impact on security. Another sentence or two to justify why temporarily or permanently decreasing security is a good idea.&lt;/p&gt;

&lt;h3&gt;
  
  
  Communication Plan
&lt;/h3&gt;

&lt;p&gt;How are you going to communicate to internal users, support, etc. that the change is happening? Consider that if the change impacts users or causes downtime, you may need to communicate the change weeks in advance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test plan
&lt;/h3&gt;

&lt;p&gt;If this is a high risk or complex change (or not easy to back out), how are you going to test this first? If you are not going to test it first, justify that it is either easy to back out or otherwise low risk. A reviewer might have suggestions of what needs to be tested or how to test.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before the change
&lt;/h3&gt;

&lt;p&gt;What steps are you going to take to prepare for the change, or to stage or test things before you make the change?&lt;/p&gt;

&lt;h3&gt;
  
  
  Steps
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Step 1&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Backup or save current state…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Do something&lt;/p&gt;

&lt;p&gt;Run a command:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;this is a command that you run that you will copy/paste during the change&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Test that the change worked!&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;This is some expected output you should see&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If it didn’t work, do the backout steps and try again… be specific about what happens if things go wrong.&lt;/p&gt;

&lt;p&gt;Any follow-up or cleanup steps&lt;/p&gt;

&lt;h3&gt;
  
  
  Monitoring
&lt;/h3&gt;

&lt;p&gt;What can we monitor to know that this change worked?&lt;/p&gt;

&lt;p&gt;How can we check for unexpected side effects on the application?&lt;/p&gt;

&lt;p&gt;What other parts of the application could be affected by this change?&lt;/p&gt;

&lt;h3&gt;
  
  
  Backout plan
&lt;/h3&gt;

&lt;p&gt;The same as Steps but specifically how you would back out changes. If you can’t back out the change, note it here.&lt;/p&gt;

&lt;p&gt;Undo some stuff&lt;/p&gt;

&lt;p&gt;Undo some other stuff&lt;/p&gt;

&lt;p&gt;Check that the undoing worked&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;Change plans are an excellent tool to help you manage your ever changing production environments. While they may seem like pure overhead at first, they shine when faced with uncertain, complex, or risky changes.&lt;/p&gt;

</description>
      <category>operations</category>
      <category>devops</category>
      <category>codereview</category>
    </item>
    <item>
      <title>Gushing Over AWS Application Load Balancer Access Logs</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Wed, 15 May 2024 20:26:49 +0000</pubDate>
      <link>https://dev.to/grunet/gushing-over-aws-application-load-balancer-access-logs-2bf</link>
      <guid>https://dev.to/grunet/gushing-over-aws-application-load-balancer-access-logs-2bf</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What They Are&lt;/li&gt;
&lt;li&gt;
Why They Are Great

&lt;ul&gt;
&lt;li&gt;Chock Full of Details&lt;/li&gt;
&lt;li&gt;Close to End-User Behavior and Pain&lt;/li&gt;
&lt;li&gt;Non-Invasive to Enable&lt;/li&gt;
&lt;li&gt;Supported by Vendors&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

Some Frustrations

&lt;ul&gt;
&lt;li&gt;Batching&lt;/li&gt;
&lt;li&gt;Poor Integrations&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Takeaway&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  What They Are
&lt;/h2&gt;

&lt;p&gt;Auto-generated logs that capture details on each request that passes through an Application Load Balancer (ALB) and onward to a backend target. (&lt;a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-access-logs.html" rel="noopener noreferrer"&gt;Here is their main AWS doc page&lt;/a&gt;)&lt;/p&gt;

&lt;h2&gt;
  
  
  Why They Are Great
&lt;/h2&gt;

&lt;p&gt;There are multiple reasons to get excited about these logs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla2pg6giyus13kozvzj1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fla2pg6giyus13kozvzj1.jpg" alt="An anime character with mouth open wide and eyes widened in fascination of something. The character is Boji from Ranking of Kings."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Chock Full of Details
&lt;/h3&gt;

&lt;p&gt;ALB access logs include a huge amount of information on each request, for example&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Request Method&lt;/li&gt;
&lt;li&gt;Request Path (including query parameters)&lt;/li&gt;
&lt;li&gt;Client IP Address&lt;/li&gt;
&lt;li&gt;User Agent&lt;/li&gt;
&lt;li&gt;Request Start Time&lt;/li&gt;
&lt;li&gt;Request End Time&lt;/li&gt;
&lt;li&gt;Request Duration&lt;/li&gt;
&lt;li&gt;Load Balancer Status Code&lt;/li&gt;
&lt;li&gt;Target Status Code&lt;/li&gt;
&lt;li&gt;Target Internal IP Address&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And that’s just to mention a few! (&lt;a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-access-logs.html#access-log-entry-syntax" rel="noopener noreferrer"&gt;Here is the full list of attributes&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;With just these attributes you can do things like&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Look for requests coming from a particular ip address&lt;/li&gt;
&lt;li&gt;Look for requests blocked by the Web Application Firewall (403s for the load balancer status code with no target status code)&lt;/li&gt;
&lt;li&gt;Look for a service that temporarily went down (502s for the load balancer status code with no target status code)&lt;/li&gt;
&lt;/ul&gt;
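&lt;p&gt;Those checks can be run directly against raw log lines. Here’s a simplified sketch of a parser (field positions follow the documented entry syntax; the quoted-field handling is naive and assumes no escaped quotes):&lt;/p&gt;

```javascript
// Simplified sketch: pull the two status-code fields out of an ALB access
// log line. Fields are space-delimited, with some fields wrapped in quotes.
function parseAlbLine(line) {
  // Match either a quoted field or a run of non-space characters.
  const fields = line.match(/"[^"]*"|\S+/g) ?? [];
  return {
    elbStatusCode: fields[8],    // load balancer status code
    targetStatusCode: fields[9], // target status code ("-" if no target responded)
  };
}

// Label the two example cases from the list above.
function classify(entry) {
  const noTargetResponse = entry.targetStatusCode === "-";
  if (noTargetResponse) {
    if (entry.elbStatusCode === "403") return "blocked-before-target"; // e.g. a WAF block
    if (entry.elbStatusCode === "502") return "target-unreachable";    // e.g. service down
  }
  return "other";
}
```

&lt;p&gt;Running something like this over a batch of log files gives a quick census of firewall blocks and backend blips without any extra infrastructure.&lt;/p&gt;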

&lt;h3&gt;
  
  
  Close to End-User Behavior and Pain
&lt;/h3&gt;

&lt;p&gt;Access logs from a public-facing ALB give the closest representation of what end-users are doing with your application (client-side instrumentation aside) since they record every single request (no sampling).&lt;/p&gt;

&lt;p&gt;They also give the best representation of the pain your end-users are facing (e.g. looking at 5xx’s the load balancer is returning).&lt;/p&gt;

&lt;p&gt;Backend instrumentation alone will always be missing part of the picture (e.g. when a backend service is hard down, or because of sampling).&lt;/p&gt;

&lt;h3&gt;
  
  
  Non-Invasive to Enable
&lt;/h3&gt;

&lt;p&gt;No application code changes or instrumentation with 3rd party agents/libraries are required to turn on ALB access logs.&lt;/p&gt;

&lt;p&gt;Just &lt;a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/application/enable-access-logging.html" rel="noopener noreferrer"&gt;enable them&lt;/a&gt; like you configure the other parts of your infrastructure via IaC, the CLI, or ClickOps. Then watch the log files start to show up in S3.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported by Vendors
&lt;/h3&gt;

&lt;p&gt;Most observability vendors offer a solution for ingesting ALB access log files into their platform for querying there (e.g. a Lambda that can trigger off the log bucket’s new-object-created event).&lt;/p&gt;

&lt;p&gt;Alternatively, Athena can be used to query them from within AWS.&lt;/p&gt;
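&lt;p&gt;The vendor-ingestion route usually boils down to a small Lambda that fires on each new log object. A sketch of its skeleton (only the event parsing is shown concretely; the fetching and forwarding steps are left as comments, and the key decoding follows the standard S3 event notification encoding):&lt;/p&gt;

```javascript
// Skeleton of a log-forwarding Lambda triggered by S3 object-created events.

// S3 event notification keys are URL-encoded, with spaces encoded as "+".
function extractS3Ref(event) {
  const record = event.Records[0].s3;
  return {
    bucket: record.bucket.name,
    key: decodeURIComponent(record.object.key.replace(/\+/g, " ")),
  };
}

async function handler(event) {
  const { bucket, key } = extractS3Ref(event);
  // 1. GetObject(bucket, key) with the AWS SDK
  // 2. gunzip the body (ALB delivers access logs gzip-compressed)
  // 3. split into lines and POST them to the vendor's intake endpoint
}
```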

&lt;h2&gt;
  
  
  Some Frustrations
&lt;/h2&gt;

&lt;p&gt;To be honest, these logs aren't perfect and have a few notable downsides.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2jzvs04m7utffe9m8foy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2jzvs04m7utffe9m8foy.png" alt="A slightly chubby yellow creature that looks like a duck without a neck standing upright on its back flippers. It's holding its front flippers to the sides of its head. It's Psyduck the pokemon having a headache."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Batching
&lt;/h3&gt;

&lt;p&gt;An ALB batches the access logs it generates and sends them to their S3 bucket once every 5 minutes. This means they’re not suitable for situations where human response times to changes need to be faster than that. This is one area where backend instrumentation wins out.&lt;/p&gt;

&lt;p&gt;(I believe GCP’s analogous logs don’t have this limitation.)&lt;/p&gt;

&lt;h3&gt;
  
  
  Poor Integrations
&lt;/h3&gt;

&lt;p&gt;Web Application Firewalls (WAFs) can be configured to record their own logs of every request sent to a load balancer they protect, but as far as I know there’s no way to integrate those with ALB access logs. They are a totally separate source of info.&lt;/p&gt;

&lt;p&gt;Also, ALB access logs can’t take part in OpenTelemetry tracing as far as I know (though it would be pretty cool if they did).&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;If you’re stranded on an island and can only have one piece of telemetry, I’d recommend picking ALB access logs. They have their limitations, but in my opinion they deliver the most value for the least investment.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Handling Concurrent Load During an AWS Outage: A Tradeoff To Consider</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sun, 06 Aug 2023 01:39:05 +0000</pubDate>
      <link>https://dev.to/grunet/handling-concurrent-load-during-an-aws-outage-a-tradeoff-to-consider-28h2</link>
      <guid>https://dev.to/grunet/handling-concurrent-load-during-an-aws-outage-a-tradeoff-to-consider-28h2</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The Main Compute Primitives for AWS&lt;/li&gt;
&lt;li&gt;
Control Planes vs Data Planes

&lt;ul&gt;
&lt;li&gt;Static Stability&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
How Each Primitive Handles Concurrency During a Control Plane Outage

&lt;ul&gt;
&lt;li&gt;EC2&lt;/li&gt;
&lt;li&gt;Fargate&lt;/li&gt;
&lt;li&gt;Lambda&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;The Takeaway&lt;/li&gt;
&lt;li&gt;References&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Main Compute Primitives for AWS
&lt;/h2&gt;

&lt;p&gt;There are 3 compute primitives in AWS (Amazon Web Services) that almost all of its other compute offerings are built on top of&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Virtual Machines (i.e. EC2, Elastic Compute Cloud)&lt;/li&gt;
&lt;li&gt;Containers (e.g. Fargate for ECS, Elastic Container Service, or EKS, Elastic Kubernetes Service)&lt;/li&gt;
&lt;li&gt;Functions (i.e. Lambda)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each comes with its own set of tradeoffs, but there is one subtle tradeoff that only manifests during certain AWS outages.&lt;/p&gt;

&lt;p&gt;To understand that tradeoff, we first need to understand the concepts of “control planes” and “data planes” of AWS services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Control Planes vs Data Planes
&lt;/h2&gt;

&lt;p&gt;Every AWS compute service is separated into 2 logical components, a control plane and a data plane.&lt;/p&gt;

&lt;p&gt;The data plane is responsible for actually running the hardware and software powering the compute. Think of the physical server running a virtual machine, for example.&lt;/p&gt;

&lt;p&gt;The control plane is responsible for making changes to the data plane. If you want to add a new virtual machine to the data plane, you have to make a request to the control plane for it to do so on your behalf.&lt;/p&gt;

&lt;h3&gt;
  
  
  Static Stability
&lt;/h3&gt;

&lt;p&gt;Services are designed this way in part to be more fault tolerant. If an outage occurs in the control plane, the data plane will continue working without issue. &lt;/p&gt;

&lt;p&gt;And in general, outages in control planes are more common than outages in data planes.&lt;/p&gt;

&lt;p&gt;This leads to the concept of “static stability”, where as long as your workload doesn’t depend on control planes, it will remain stable during most AWS outages.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Each Primitive Handles Concurrency During a Control Plane Outage
&lt;/h2&gt;

&lt;p&gt;But your existing workloads being stable during an AWS outage might not be enough. What if there’s a surge in load that they need to respond to? Will they be able to scale up to meet that demand?&lt;/p&gt;

&lt;p&gt;Specifically, there’s the question of the maximum concurrency a workload can support during an AWS service control plane outage.&lt;/p&gt;

&lt;p&gt;The answer to this question (perhaps surprisingly) depends on the compute primitive involved.&lt;/p&gt;

&lt;h3&gt;
  
  
  EC2
&lt;/h3&gt;

&lt;p&gt;In normal times, in the face of increased concurrency a workload can autoscale up to handle it (e.g. an ASG, Autoscaling Group, can bring up more virtual machines).&lt;/p&gt;

&lt;p&gt;However, during an outage of the EC2 control plane this isn’t possible, since autoscaling requires requests to the control plane.&lt;/p&gt;

&lt;p&gt;This means that during the outage, the maximum concurrency a workload can support is fixed and cannot be increased. Any requests exceeding this limit will fail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fargate
&lt;/h3&gt;

&lt;p&gt;Fargate behaves similarly to EC2, as starting new tasks requires a request to the ECS or EKS control plane.&lt;/p&gt;

&lt;p&gt;So during a control plane outage, any requests exceeding the fixed maximum concurrency of the workload will fail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lambda
&lt;/h3&gt;

&lt;p&gt;Lambda is the odd duck out. &lt;/p&gt;

&lt;p&gt;In normal times, in the face of increased concurrency a workload can start up multiple new Lambda execution environments to handle the load.&lt;/p&gt;

&lt;p&gt;But the subtlety here is that this behavior is part of the Lambda data plane, NOT the control plane.&lt;/p&gt;

&lt;p&gt;This means that during an outage of the Lambda control plane, a workload can still handle essentially arbitrary concurrency of requests (only limited by your account’s quota on concurrent executions).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iPIUKoVS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yn5hpw73lvw3099mls3t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iPIUKoVS--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/yn5hpw73lvw3099mls3t.png" alt="The AWS lambda icon on top of a sports podium for awarding medals like you'd find in the Olympics. There is a 1 beneath the lambda icon, then a 2 to the left of it, and a 3 to the right of it. Nothing is on top of the 2nd or 3rd place spots, they are empty." width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;If you need to handle arbitrary concurrency while the control plane of a service is impaired, Lambda provides the best tradeoff.&lt;/p&gt;

&lt;p&gt;Fargate or EC2 (or any other more managed service built on top of them, e.g. Elastic Beanstalk) will not be able to meet the need.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://docs.aws.amazon.com/pdfs/whitepapers/latest/aws-fault-isolation-boundaries/aws-fault-isolation-boundaries.pdf"&gt;AWS whitepaper on fault isolation boundaries&lt;/a&gt; that defines “static stability” and references the lack of ability of EC2-based workloads to autoscale during control plane outages&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/whitepapers/latest/security-overview-aws-lambda/lambda-executions.html"&gt;Lambda whitepaper that says scaling occurs at the level of the data plane&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>operations</category>
      <category>sre</category>
    </item>
    <item>
      <title>Operational Challenges for SCIM Servers</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sun, 18 Jun 2023 17:17:47 +0000</pubDate>
      <link>https://dev.to/grunet/operational-challenges-for-scim-servers-176a</link>
      <guid>https://dev.to/grunet/operational-challenges-for-scim-servers-176a</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What is SCIM?&lt;/li&gt;
&lt;li&gt;
The Key Operational Challenges

&lt;ul&gt;
&lt;li&gt;No Load Limits&lt;/li&gt;
&lt;li&gt;All Requests Must be Synchronously Handled&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
Downstream Consequences when Faced with “Scale”

&lt;ul&gt;
&lt;li&gt;Performance Problems with 3rd Party SCIM Libraries&lt;/li&gt;
&lt;li&gt;ORM Problems Leading to Database Performance Problems Leading to Other ORM Problems&lt;/li&gt;
&lt;li&gt;Inability to Throttle Requests&lt;/li&gt;
&lt;li&gt;Inability to Queue Requests&lt;/li&gt;
&lt;li&gt;Inability to Horizontally Scale&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;
Takeaways

&lt;ul&gt;
&lt;li&gt;Targeted Avoidance of the ORM-to-SCIM-to-ORM Pattern is Valuable&lt;/li&gt;
&lt;li&gt;Threading is Potentially Valuable, Maybe&lt;/li&gt;
&lt;li&gt;Pre-Production Load Testing of SCIM Servers is Valuable&lt;/li&gt;
&lt;li&gt;Recording Failed SCIM Requests in Production Telemetry is Valuable&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What is SCIM?
&lt;/h2&gt;

&lt;p&gt;SCIM (System for Cross-domain Identity Management) is an open standard for user provisioning.&lt;/p&gt;

&lt;p&gt;For example, it allows an organization that is a customer of a SaaS product to easily sync all of their users’ identity information into the SaaS’s databases.&lt;/p&gt;

&lt;p&gt;The organization’s users’ identity information will often be stored in a 3rd party identity provider like Okta or Azure Active Directory (Azure AD). The identity provider will act as a SCIM client, sending requests with provisioning data to a SCIM server managed by the SaaS product.&lt;/p&gt;

&lt;p&gt;However, building a conformant and operationally sound SCIM server is a non-trivial task.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Key Operational Challenges
&lt;/h2&gt;

&lt;p&gt;There are 2 inherent operational difficulties all SCIM server implementations must face.&lt;/p&gt;

&lt;h3&gt;
  
  
  No Load Limits
&lt;/h3&gt;

&lt;p&gt;A SCIM client has no restrictions on how quickly it can send requests to your SCIM server.&lt;/p&gt;

&lt;p&gt;In theory, this means that your SCIM server needs to be able to handle arbitrary requests at arbitrary concurrency.&lt;/p&gt;

&lt;p&gt;In practice, this means that your SCIM server needs to be able to handle the load of “the worst” SCIM client out there (Azure AD’s SCIM client has been observed to send just shy of 1,000 requests per minute at peak load).&lt;/p&gt;

&lt;h3&gt;
  
  
  All Requests Must be Synchronously Handled
&lt;/h3&gt;

&lt;p&gt;SCIM clients rely on the status code of responses to determine whether or not to retry a request (e.g. a client may retry all 500-level errors until it succeeds).&lt;/p&gt;

&lt;p&gt;This means that all request processing has to be done synchronously (i.e. it can’t be offloaded to consumers of a separate queue).&lt;/p&gt;

&lt;h2&gt;
  
  
  Downstream Consequences when Faced with “Scale”
&lt;/h2&gt;

&lt;p&gt;When faced with increased load and larger customers, several different types of problems can arise for a SCIM server implementation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance Problems with 3rd Party SCIM Libraries
&lt;/h3&gt;

&lt;p&gt;Handling SCIM requests means writing the logic for applying a request’s changes to a SCIM resource (e.g. PATCH-ing a group).&lt;/p&gt;

&lt;p&gt;Depending on your runtime, there may be existing libraries that already have this logic ready for re-use (e.g. &lt;a href="https://www.npmjs.com/package/scim-patch"&gt;scim-patch&lt;/a&gt; for PATCH requests in Node.js). However, a library’s code may not necessarily be optimized for performance in every case. &lt;/p&gt;

&lt;p&gt;For example, a library function’s execution time may sometimes scale quadratically with the size of the request and/or SCIM resource (e.g. for groups with a large number of users). This can drastically slow down response times (think 10s of seconds) and hog server CPU (a lethal problem for single-threaded runtimes like Node.js).&lt;/p&gt;

&lt;h3&gt;
  
  
  ORM Problems Leading to Database Performance Problems Leading to Other ORM Problems
&lt;/h3&gt;

&lt;p&gt;If you’re using an ORM (Object-Relational Mapping), all of your PATCH or PUT SCIM endpoints may work as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the ORM’s representation of the SCIM resource from the database&lt;/li&gt;
&lt;li&gt;Transform the ORM’s representation into a SCIM representation expected by the SCIM library&lt;/li&gt;
&lt;li&gt;Apply the request to the SCIM representation using the SCIM library&lt;/li&gt;
&lt;li&gt;Transform the updated SCIM representation back into an ORM representation of the resource&lt;/li&gt;
&lt;li&gt;Save the ORM representation back to the database&lt;/li&gt;
&lt;/ol&gt;
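&lt;p&gt;The steps above can be sketched roughly like this (every name here, e.g. loadGroup, toScim, saveGroup, is a hypothetical stand-in for your ORM and SCIM library):&lt;/p&gt;

```typescript
// Toy in-memory "database" standing in for the ORM's backing store.
const db = new Map([
  ["g1", { id: "g1", displayName: "Engineering", memberIds: ["u1"] }],
]);

// Step 1: load the ORM representation of the SCIM resource
const loadGroup = (id: string) => structuredClone(db.get(id)!);

// Step 2: transform it into the SCIM representation the library expects
const toScim = (g: { id: string; displayName: string; memberIds: string[] }) => ({
  id: g.id,
  displayName: g.displayName,
  members: g.memberIds.map((value) => ({ value })),
});

// Step 3: apply the request (a real server would use a library like scim-patch)
const applyAddMember = (g: any, userId: string) => ({
  ...g,
  members: [...g.members, { value: userId }],
});

// Steps 4 and 5: transform back and save. The ORM only sees the full
// member list, not the delta, which is where the trouble starts.
const fromScim = (g: any) => ({
  id: g.id,
  displayName: g.displayName,
  memberIds: g.members.map((m: any) => m.value),
});
const saveGroup = (g: any) => db.set(g.id, g);

saveGroup(fromScim(applyAddMember(toScim(loadGroup("g1")), "u2")));
console.log(db.get("g1")!.memberIds); // [ "u1", "u2" ]
```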

&lt;p&gt;The problem occurs in Step 5: the ORM has no idea what exactly changed, and so in certain cases it can end up recreating the database records from scratch.&lt;/p&gt;

&lt;p&gt;For example, consider PATCH-ing a very large group (10,000 or more users) to add 1 user to the group, where in the database users in a group are represented in a join table between the Groups and Users tables.  In Step 5, the ORM isn’t aware that 1 user was added, so instead it generates SQL to delete all the existing users from the group and then re-insert them all plus the 1 new user.&lt;/p&gt;

&lt;p&gt;This can become extremely problematic when these requests happen at high concurrency (e.g. Azure AD sending a huge number of near parallel requests to add 1 user at a time to a group). Every request requires an exclusive lock on the join table because of the deletes, so the transactions end up being processed 1 at a time at the database-level, leading to very slow query times.&lt;/p&gt;

&lt;p&gt;Because of these delays in database processing, ORM operations will start to back up in their queue and time out. ORM database connection pools will become maxed out as well. This will cause a massive percentage (e.g. 95+%) of the requests to fail; so many that no amount of retrying from the SCIM client will fix things.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--y_Ol_sLm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bqsy0rh0iv3xzb5m7gqp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--y_Ol_sLm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bqsy0rh0iv3xzb5m7gqp.jpg" alt="The &amp;quot;This is fine&amp;quot; popular internet meme. A two panel comic. The first panel shows an anthropomorphized dog sitting on a chair next to a table with a coffee mug on it. The room the dog is in is covered in flames and smoke covers the ceiling. The dog's eyes do not seem to betray any fear of the situation. In the second panel we zoom in on the dog's face where the dog says this is fine in a speech bubble that appears above their face. " width="561" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Inability to Throttle Requests
&lt;/h3&gt;

&lt;p&gt;Because of the severe slowdown and serialization of request processing under this high concurrency, throttling is ineffective; it just leads to timeouts at the gateway instead (e.g. 504s from the load balancer fronting the SCIM server replicas).&lt;/p&gt;

&lt;h3&gt;
  
  
  Inability to Queue Requests
&lt;/h3&gt;

&lt;p&gt;Because SCIM requires that requests be handled synchronously, putting them onto a separate queue (e.g. an SQS queue) to avoid all of the aforementioned problems won’t work: if a queue consumer fails to process a request, there’s no way to signal that failure back to the SCIM client so it knows to retry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inability to Horizontally Scale
&lt;/h3&gt;

&lt;p&gt;SCIM clients’ barrage of requests can start suddenly and end just as quickly (think minutes), which isn’t enough time for traditional autoscaling tools to respond by creating more replicas of the SCIM servers.&lt;/p&gt;

&lt;p&gt;Also, in the case of the database lock contention issue mentioned before, more replicas (and hence more available database connections) can actually make things worse, as the queue of concurrent transactions waiting on the database lock will grow even longer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Stepping back, there are a few high-level things to take away from the previous pessimism. &lt;/p&gt;

&lt;h3&gt;
  
  
  Targeted Avoidance of the ORM-to-SCIM-to-ORM Pattern is Valuable
&lt;/h3&gt;

&lt;p&gt;In the large group scenario from before, suppose every request happens to add or remove a single user. Making that 1 change directly to the database (via the ORM), rather than creating an intermediate SCIM representation of the group that the ORM then has to save back, avoids all of the aforementioned problems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Threading is Potentially Valuable, Maybe
&lt;/h3&gt;

&lt;p&gt;As an alternative to using a separate queue, using separate runtime threads and an in-memory request queue may help avoid saturation in the specific case of a server CPU bottleneck coming from request processing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-Production Load Testing of SCIM Servers is Valuable
&lt;/h3&gt;

&lt;p&gt;Simulating the type and concurrency of requests that SCIM clients deliver is a valuable exercise, as it can unearth bottlenecks in the SCIM server that may otherwise not become apparent until production.&lt;/p&gt;

&lt;p&gt;You might try forking &lt;a href="https://github.com/wso2-incubator/scim2-compliance-test-suite"&gt;https://github.com/wso2-incubator/scim2-compliance-test-suite&lt;/a&gt; and adjusting it to be able to send requests at high concurrency towards this end.&lt;/p&gt;
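&lt;p&gt;A load test mainly needs a way to fire requests at a fixed concurrency. Here is a minimal sketch of such a helper (the stand-in task is a placeholder; swap in real SCIM requests against a test environment):&lt;/p&gt;

```typescript
// Run `total` async tasks with at most `concurrency` in flight at once.
async function runWithConcurrency(
  total: number,
  concurrency: number,
  task: (i: number) => Promise<void>,
) {
  let next = 0;
  const worker = async () => {
    while (next < total) {
      const i = next++;
      await task(i);
    }
  };
  await Promise.all(Array.from({ length: concurrency }, worker));
}

// Stand-in task; replace the body with a real SCIM request, e.g.
// await fetch(scimUrl, { method: "PATCH", headers, body }) against a test server.
let completed = 0;
await runWithConcurrency(100, 16, async () => {
  await new Promise((r) => setTimeout(r, 1));
  completed++;
});
console.log(completed); // 100
```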

&lt;h3&gt;
  
  
  Recording Failed SCIM Requests in Production Telemetry is Valuable
&lt;/h3&gt;

&lt;p&gt;The first step in addressing a novel operational problem happening with your SCIM servers in production is usually to develop some understanding of it. &lt;/p&gt;

&lt;p&gt;On top of your usual sources of telemetry (e.g. OpenTelemetry, Application Performance Monitoring tools, RDS Performance Monitoring), recording the raw SCIM request data of failed requests in your telemetry (e.g. logs, traces) can be very helpful in figuring out what exactly is going on.&lt;/p&gt;
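&lt;p&gt;As a rough sketch (the request and handler shapes here are hypothetical, not tied to any framework), a wrapper that records failed requests might look like this:&lt;/p&gt;

```typescript
// Wrap a request handler so failed SCIM requests are recorded with their
// raw bodies. The ScimRequest/Handler shapes are illustrative.
type ScimRequest = { method: string; url: string; body: string };
type Handler = (req: ScimRequest) => Promise<{ status: number }>;

function withFailureLogging(handler: Handler, log: (entry: unknown) => void): Handler {
  return async (req) => {
    const res = await handler(req);
    if (res.status >= 400) {
      // Consider scrubbing PII before shipping this to your telemetry backend.
      log({ method: req.method, url: req.url, status: res.status, rawBody: req.body });
    }
    return res;
  };
}

const failures: unknown[] = [];
const handler = withFailureLogging(async () => ({ status: 500 }), (e) => failures.push(e));
await handler({ method: "PATCH", url: "/scim/v2/Groups/g1", body: "{}" });
console.log(failures.length); // 1
```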

</description>
      <category>devops</category>
      <category>scim</category>
      <category>performance</category>
    </item>
    <item>
      <title>The Hidden Tradeoff of Keyless Auth</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Thu, 18 May 2023 02:19:46 +0000</pubDate>
      <link>https://dev.to/grunet/the-hidden-tradeoff-of-keyless-auth-48o7</link>
      <guid>https://dev.to/grunet/the-hidden-tradeoff-of-keyless-auth-48o7</guid>
      <description>&lt;h2&gt;
  
  
  What is Keyless Auth and Why Should I Care?
&lt;/h2&gt;

&lt;p&gt;Keyless auth refers to being able to authenticate to a system without using any long-lived credentials.&lt;/p&gt;

&lt;p&gt;This means getting access to a non-public system without a username/password, a public/private key pair, an access key, etc… while (somewhat magically) maintaining security.&lt;/p&gt;

&lt;p&gt;Here are a few places using it today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Getting AWS access from a Github Actions workflow&lt;/li&gt;
&lt;li&gt;SSH-ing into VMs using Teleport&lt;/li&gt;
&lt;li&gt;Signing artifacts using cosign and sigstore&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You should care because any long-lived credential you can get rid of is 1 less target for attackers to compromise.&lt;/p&gt;

&lt;h2&gt;
  
  
  How do Keyless Auth Systems Work?
&lt;/h2&gt;

&lt;p&gt;There are usually 3 parties involved in an auth interaction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Requestor (e.g. a Github Actions workflow run)&lt;/li&gt;
&lt;li&gt;The Identity Provider (e.g. Github’s OIDC provider)&lt;/li&gt;
&lt;li&gt;The Resource Provider (e.g. an AWS account)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The flow then goes something like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Requestor wants to access the Resource Provider&lt;/li&gt;
&lt;li&gt;The Requestor asks the Identity Provider for a token capturing the identity of the Requestor&lt;/li&gt;
&lt;li&gt;The Identity Provider vends it a token&lt;/li&gt;
&lt;li&gt;The Requestor sends the token to the Resource Provider&lt;/li&gt;
&lt;li&gt;The Resource Provider then sends the token back to the Identity Provider, asking if this is a valid request&lt;/li&gt;
&lt;li&gt;The Identity Provider confirms it just made that token and it’s expected&lt;/li&gt;
&lt;li&gt;The Resource Provider allows the Requestor time-limited access via temporary credentials&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The critical part here is the Identity Provider (e.g. Github’s OIDC provider) and the Resource Provider (e.g. an AWS account) have already previously established a trust relationship via configuration inside the Resource Provider. That’s what enables the Resource Provider to trust that the token isn’t malicious.&lt;/p&gt;
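&lt;p&gt;To make the trust relationship concrete, here is a toy simulation of the flow. A shared HMAC secret stands in for real OIDC signature verification (real systems use asymmetric signatures and published JWKS public keys; every name here is illustrative):&lt;/p&gt;

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";
import { Buffer } from "node:buffer";

// The pre-established trust: the Resource Provider knows how to verify
// tokens from the Identity Provider (here, a shared secret).
const IDP_SECRET = "demo-only-secret";

// Steps 2-3: the Identity Provider vends a signed token for the Requestor.
function issueToken(subject: string) {
  const payload = Buffer.from(JSON.stringify({ sub: subject })).toString("base64url");
  const sig = createHmac("sha256", IDP_SECRET).update(payload).digest("base64url");
  return `${payload}.${sig}`;
}

// Steps 5-6: the Resource Provider checks that the token really came from
// the trusted Identity Provider before granting temporary access.
function verifySubject(token: string) {
  const [payload, sig] = token.split(".");
  const expected = createHmac("sha256", IDP_SECRET).update(payload).digest("base64url");
  if (sig.length !== expected.length) return null;
  if (!timingSafeEqual(Buffer.from(sig), Buffer.from(expected))) return null;
  return JSON.parse(Buffer.from(payload, "base64url").toString()).sub as string;
}

const token = issueToken("repo:my-org/my-repo"); // Step 4: sent to the Resource Provider
console.log(verifySubject(token)); // "repo:my-org/my-repo"
console.log(verifySubject("tampered." + token.split(".")[1])); // null
```

&lt;p&gt;The takeaway from the toy: verification works only because of the pre-established trust material, which is exactly what an attacker gains by compromising the Identity Provider.&lt;/p&gt;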

&lt;p&gt;But there’s a catch to this that no one seems to talk about. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Hidden, Unspoken Tradeoff
&lt;/h2&gt;

&lt;p&gt;Imagine that Github’s OIDC provider were to get compromised (it’s not inconceivable; it’s a massive target, just like LastPass and CircleCI were). This would mean that malicious actors could also get access to any AWS accounts configured for keyless auth from Github Actions.&lt;/p&gt;

&lt;p&gt;The same exploit is not necessarily possible if you’re managing your own long-lived AWS access keys. You can make securing them fully independent of the security posture of Github’s OIDC provider.&lt;/p&gt;

&lt;p&gt;So the tradeoff, in general, is placing trust in the Identity Provider’s security at the expense of giving up control over part of your Resource Provider’s security surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;I will continue to use keyless auth solutions as I think the tradeoff is almost always worth it.&lt;/p&gt;

&lt;p&gt;However, I will now think twice about the vendors involved before jumping for it.&lt;/p&gt;

</description>
      <category>security</category>
      <category>cicd</category>
    </item>
    <item>
      <title>The Lack of Disabled Peoples' Experiences in Web Accessibility is Concerning</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Thu, 18 May 2023 01:51:35 +0000</pubDate>
      <link>https://dev.to/grunet/the-lack-of-disabled-peoples-experiences-in-web-accessibility-is-concerning-5d8g</link>
      <guid>https://dev.to/grunet/the-lack-of-disabled-peoples-experiences-in-web-accessibility-is-concerning-5d8g</guid>
      <description>&lt;p&gt;If you've ever done frontend work around accessibility, odds are the following are true&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You are abled&lt;/li&gt;
&lt;li&gt;You never met an affected disabled person in the course of the work&lt;/li&gt;
&lt;li&gt;You never learned if your changes actually helped disabled users&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You may have closed a ticket, or remediated a finding from an auditor, but any learning from real disabled peoples' experiences likely didn't happen.&lt;/p&gt;

&lt;p&gt;We all mostly assume this status quo is fine. That as long as the VPAT (Voluntary Product Accessibility Template) or the ACR (Accessibility Conformance Report) look good to customers, there's nothing else to worry about.&lt;/p&gt;

&lt;p&gt;Note that this mindset would be bizarre when applied to any other measurement of a product's functionality (e.g. the revenue it generates). Having someone come in yearly to disclose to you how much money your product is or isn't making would be unacceptable on several grounds. Yet we accept it for accessibility.&lt;/p&gt;

&lt;p&gt;It's easy to say teams should do more. Teams should involve disabled people at every phase of the SDLC. Companies should hire more inclusively. But this seldom happens, as it's hard from multiple socio-organizational angles.&lt;/p&gt;

&lt;p&gt;There need to be more ways to draw from real disabled user experiences when creating on the web. Not just for large corporations with deep pockets, but small businesses too.&lt;/p&gt;

&lt;p&gt;Until something changes, the practice of accessibility will always remain concerning.&lt;/p&gt;

</description>
      <category>a11y</category>
      <category>frontend</category>
    </item>
    <item>
      <title>Leveraging OpenTelemetry in Deno</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sat, 08 Apr 2023 00:41:14 +0000</pubDate>
      <link>https://dev.to/grunet/leveraging-opentelemetry-in-deno-45bj</link>
      <guid>https://dev.to/grunet/leveraging-opentelemetry-in-deno-45bj</guid>
      <description>&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Background&lt;/li&gt;
&lt;li&gt;Goal&lt;/li&gt;
&lt;li&gt;
A Minimal Interesting Example

&lt;ul&gt;
&lt;li&gt;
The Boilerplate

&lt;ul&gt;
&lt;li&gt;Autoinstrumentation&lt;/li&gt;
&lt;li&gt;
Span Processing and Exporting

&lt;ul&gt;
&lt;li&gt;Exporting to the Console and to a Tracing Vendor &lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

The App Code

&lt;ul&gt;
&lt;li&gt;OpenTelemetry’s API Surface&lt;/li&gt;
&lt;li&gt;Handling Concurrent Requests and the Need for Async Context&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;

Future Directions

&lt;ul&gt;
&lt;li&gt;Adding Structured Data to Spans&lt;/li&gt;
&lt;li&gt;To Deploy and Beyond&lt;/li&gt;
&lt;li&gt;Actually Knowing CPU Time per Request&lt;/li&gt;
&lt;li&gt;Autoinstrumenting Deno-specific APIs and Libraries&lt;/li&gt;
&lt;li&gt;Putting the “Distributed” in Distributed Tracing&lt;/li&gt;
&lt;li&gt;All the Pillars&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Parting Thoughts&lt;/li&gt;

&lt;li&gt;References&lt;/li&gt;

&lt;li&gt;Console Output&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;Deno is a runtime for JavaScript, TypeScript, and WebAssembly that is based on the V8 JavaScript engine and the Rust programming language.&lt;/p&gt;

&lt;p&gt;OpenTelemetry (OTEL) is a collection of tools, APIs, and SDKs. It's used to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) to help you analyze your software’s performance and behavior.&lt;/p&gt;

&lt;p&gt;Until relatively recently, it wasn’t possible to bring the power of OpenTelemetry to bear on Deno. All of us were missing out on the information OpenTelemetry can gather, in particular for tracing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Goal
&lt;/h2&gt;

&lt;p&gt;This article will go over, in depth, 1 simplified example of using OpenTelemetry for tracing in Deno. &lt;/p&gt;

&lt;p&gt;The aim is primarily to serve as an introduction to OpenTelemetry concepts for folks already somewhat familiar with Deno.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Minimal Interesting Example
&lt;/h2&gt;

&lt;p&gt;Here is a visualization of 1 trace emitted by the example code (taken from Honeycomb, an observability vendor).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Floahutlkxrpum3p6xjn4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Floahutlkxrpum3p6xjn4.png" alt="3 rows, with 1 horizontal bar in each of varying lengths. The top bar is the longest and it says 16.34ms inside the bar. The 2nd bar says 12.00 ms inside the bar, and it starts a little after the top bar and ends a little after it. The 3rd bar says 0.1948ms and is very short, starting and ending just before the end of the top bar. To the left of each row there's a name given to each bar. The top bar is named handler. The second bar is named HTTP GET. The third bar is named construct body. There is a tree structure that indicates that the HTTP GET row and the contruct body row are both children of the handler row."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The "HTTP GET" span ending after its parent is due to &lt;a href="https://github.com/open-telemetry/opentelemetry-js/issues/3719" rel="noopener noreferrer"&gt;a small bug in otel-js&lt;/a&gt;. For now, just pretend it ended before its parent did.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Note the following things you can tell without even looking at the code&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Different parts of the code have their duration measured&lt;/li&gt;
&lt;li&gt;Outgoing HTTP requests are captured&lt;/li&gt;
&lt;li&gt;The structure of the code is probably reflected in the structure of the diagram&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And now here is the code that was used to generate that telemetry.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;registerInstrumentations&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/instrumentation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;FetchInstrumentation&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;npm:@opentelemetry/instrumentation-fetch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NodeTracerProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/sdk-trace-node&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Resource&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/resources&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SemanticResourceAttributes&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/semantic-conventions&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ConsoleSpanExporter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/sdk-trace-base&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OTLPTraceExporter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/exporter-trace-otlp-proto&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;opentelemetry&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/api&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;serve&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://deno.land/std@0.180.0/http/server.ts&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// autoinstrumentation.ts&lt;/span&gt;

&lt;span class="nf"&gt;registerInstrumentations&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;instrumentations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FetchInstrumentation&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Monkeypatching to get past FetchInstrumentation's dependence on sdk-trace-web, which has runtime dependencies on some browser-only constructs. See https://github.com/open-telemetry/opentelemetry-js/issues/3413#issuecomment-1496834689 for more details&lt;/span&gt;
&lt;span class="c1"&gt;// Specifically for this line - https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-sdk-trace-web/src/utils.ts#L310&lt;/span&gt;
&lt;span class="nx"&gt;globalThis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt; 

&lt;span class="c1"&gt;// tracing.ts&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="nx"&gt;Resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Resource&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;SemanticResourceAttributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SERVICE_NAME&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;deno-demo-service&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;SemanticResourceAttributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SERVICE_VERSION&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0.1.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NodeTracerProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;consoleExporter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ConsoleSpanExporter&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;consoleExporter&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;traceExporter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OTLPTraceExporter&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;traceExporter&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Application code&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;opentelemetry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deno-demo-tracer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// This call will be autoinstrumented&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://www.example.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`constructBody`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`Your user-agent is:\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user-agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unknown&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;instrument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Helper code&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;instrument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;instrumentedHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startActiveSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;handler&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

      &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;instrumentedHandler&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It’s a lot! But let’s take a look at it piece by piece.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Boilerplate
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Autoinstrumentation
&lt;/h4&gt;

&lt;p&gt;This is the code for the fetch autoinstrumentation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;registerInstrumentations&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/instrumentation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;FetchInstrumentation&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;npm:@opentelemetry/instrumentation-fetch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="nf"&gt;registerInstrumentations&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;instrumentations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FetchInstrumentation&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This monkeypatches the global &lt;code&gt;fetch&lt;/code&gt; so that all network calls made with it are instrumented (i.e. have data on them recorded in telemetry).&lt;/p&gt;

&lt;p&gt;It saves you the trouble of needing to instrument every &lt;code&gt;fetch&lt;/code&gt; call in application code yourself. And it also instruments &lt;code&gt;fetch&lt;/code&gt; calls your dependencies are making, which may have been otherwise impossible to track.&lt;/p&gt;
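To make the monkeypatching concrete, here is a hand-rolled, simplified sketch of the mechanism, assuming nothing beyond the standard globals. The span shape and attribute names are illustrative stand-ins, not what `FetchInstrumentation` actually emits.

```typescript
// Illustrative stand-in for the data an instrumented fetch call records
interface RecordedSpan {
  name: string;
  attributes: { [key: string]: string | number };
  durationMs: number;
}

const spans: RecordedSpan[] = [];

// Replace the global fetch with a wrapper that records a pseudo-span
// for every call, whether it came from app code or a dependency
function patchFetch(): void {
  const originalFetch = globalThis.fetch;
  globalThis.fetch = async function (input: any, init?: any) {
    const start = performance.now();
    try {
      return await originalFetch(input, init);
    } finally {
      spans.push({
        name: "HTTP " + ((init && init.method) || "GET"),
        attributes: { "http.url": String(input) },
        durationMs: performance.now() - start,
      });
    }
  };
}
```

The real instrumentation additionally records status codes, injects trace context headers, and parents the span correctly; this only shows why a single `registerInstrumentations` call can cover every `fetch` in the process.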

&lt;h4&gt;
  
  
  Span Processing and Exporting
&lt;/h4&gt;

&lt;p&gt;This is the code for setting up the span processing and exporting pipelines.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ConsoleSpanExporter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/sdk-trace-base&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;OTLPTraceExporter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/exporter-trace-otlp-proto&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;consoleExporter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ConsoleSpanExporter&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;consoleExporter&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;traceExporter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OTLPTraceExporter&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BatchSpanProcessor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;traceExporter&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Once a span has ended, it will be sent to a span processor, which will decide what to do with it and eventually pass it on to an exporter.&lt;/p&gt;

&lt;p&gt;In this case &lt;code&gt;BatchSpanProcessor&lt;/code&gt; is being used, meaning that spans are queued up in-memory and &lt;a href="https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-sdk-trace-base/src/export/BatchSpanProcessorBase.ts#L218" rel="noopener noreferrer"&gt;flushed in batches&lt;/a&gt; via a &lt;code&gt;setTimeout&lt;/code&gt; &lt;a href="https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-core/src/utils/environment.ts#L153" rel="noopener noreferrer"&gt;every 5 seconds&lt;/a&gt;.&lt;/p&gt;

&lt;h5&gt;
  
  
  Exporting to the Console and to a Tracing Vendor
&lt;/h5&gt;

&lt;p&gt;The first “endpoint” spans are exported to is the console, via the &lt;code&gt;ConsoleSpanExporter&lt;/code&gt;. This is useful to have for debugging purposes, especially when you’re not seeing traces show up in your vendor but you are seeing them in the console.&lt;/p&gt;

&lt;p&gt;The second endpoint spans are exported to is your tracing vendor (e.g. Honeycomb, NewRelic, etc...), via the &lt;code&gt;OTLPTraceExporter&lt;/code&gt;. It will depend on the vendor, but specifying your vendor’s remote endpoint and auth credentials as environment variables should be enough:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_EXPORTER_OTLP_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;your vendor HTTP OTLP ingest endpoint&amp;gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_EXPORTER_OTLP_HEADERS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;vendor specific auth credentials&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This should enable the exporter code to send telemetry via OTLP (the OpenTelemetry Protocol) over HTTP to your tracing vendor.&lt;/p&gt;
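To illustrate how those two variables are consumed: per the OTLP/HTTP conventions, trace spans are POSTed to the base endpoint plus `/v1/traces`, and `OTEL_EXPORTER_OTLP_HEADERS` holds comma-separated `key=value` pairs. A simplified sketch of that resolution (the real SDK also honors per-signal variables such as `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT`):

```typescript
// Resolve where trace spans get POSTed, from the two env vars above
function resolveOtlpTracesTarget(env: { [key: string]: string }) {
  const base = env["OTEL_EXPORTER_OTLP_ENDPOINT"].replace(/\/$/, "");
  const headers: { [key: string]: string } = {};
  const raw = env["OTEL_EXPORTER_OTLP_HEADERS"] || "";
  for (const pair of raw.split(",")) {
    const eq = pair.indexOf("=");
    if (eq > 0) {
      headers[pair.slice(0, eq).trim()] = pair.slice(eq + 1).trim();
    }
  }
  // The exporter appends the signal-specific path for traces
  return { url: base + "/v1/traces", headers };
}
```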

&lt;h3&gt;
  
  
  The App Code
&lt;/h3&gt;

&lt;h4&gt;
  
  
  OpenTelemetry’s API Surface
&lt;/h4&gt;

&lt;p&gt;Everything so far has been part of the OpenTelemetry SDK, and should not be touched directly by application code.&lt;/p&gt;

&lt;p&gt;Application code should only ever have to interact with the OpenTelemetry API. The API then hooks up to the SDK behind the scenes to do all of the things previously discussed.&lt;/p&gt;

&lt;p&gt;To create spans and other instrumentation, application code should use the OpenTelemetry API, as shown below.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;opentelemetry&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/api&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;opentelemetry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTracer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deno-demo-tracer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`constructBody`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`Your user-agent is:\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user-agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unknown&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;instrumentedHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startActiveSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;handler&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

      &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A key, subtle difference here is between &lt;code&gt;startSpan&lt;/code&gt; and &lt;code&gt;startActiveSpan&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;startSpan&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Creates a span, finds the currently “active” span, and adds the newly created span as a child of it&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;startActiveSpan&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Creates a span, finds the currently “active” span, and adds the newly created span as a child of it.&lt;/li&gt;
&lt;li&gt;Makes the new span the “active” span, so all spans created in the function passed in the 2nd parameter will be added as child spans of it&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;The difference is that &lt;code&gt;startActiveSpan&lt;/code&gt; makes its new span the active parent for the code it wraps by default, whereas &lt;code&gt;startSpan&lt;/code&gt; does not.&lt;/p&gt;
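One way to picture that bookkeeping is with a synchronous stand-in, where "active" is a literal stack (the real SDK tracks the active span through its context mechanism instead, and these names are illustrative):

```typescript
// Stand-in span: just a name and the name of its parent (if any)
interface FakeSpan {
  name: string;
  parent?: string;
}

const activeStack: FakeSpan[] = [];

// startSpan: parent off the current active span, but do NOT become active
function startSpan(name: string): FakeSpan {
  const parent = activeStack[activeStack.length - 1];
  return { name, parent: parent ? parent.name : undefined };
}

// startActiveSpan: same parenting, but also active for the callback's duration
function startActiveSpan(name: string, fn: (span: FakeSpan) => void): void {
  const span = startSpan(name);
  activeStack.push(span);
  try {
    fn(span);
  } finally {
    activeStack.pop();
  }
}
```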

&lt;p&gt;In order to have a span created by &lt;code&gt;startSpan&lt;/code&gt; become “active” and the parent of any child spans, you have to do this&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;opentelemetry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;trace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;opentelemetry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;active&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="nx"&gt;wantsToBeAParentSpan&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then passing &lt;code&gt;ctx&lt;/code&gt; as the context argument of subsequent &lt;code&gt;startSpan&lt;/code&gt; calls (or running them inside &lt;code&gt;opentelemetry.context.with(ctx, ...)&lt;/code&gt;) will create the new spans as children of &lt;code&gt;wantsToBeAParentSpan&lt;/code&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  Handling Concurrent Requests and the Need for Async Context
&lt;/h4&gt;

&lt;p&gt;Imagine 2 requests (say request A and request B) come in near simultaneously. Both create their own parent spans (say parent span A and parent span B) and both end up waiting on the asynchronous fetch call to resolve at the same time.&lt;/p&gt;

&lt;p&gt;Say request A’s fetch call finishes first. How does the &lt;code&gt;tracer.startSpan&lt;/code&gt; call know to attach itself to parent span A and not parent span B?&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://www.example.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;span&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tracer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startSpan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`constructBody`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`Your user-agent is:\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user-agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unknown&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;span&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We need some way to keep the context of the request across asynchronous events, so that &lt;code&gt;tracer.startSpan&lt;/code&gt; can know that this is still for request A, and it should make the new span a child of parent span A.&lt;/p&gt;

&lt;p&gt;OpenTelemetry-JS handles this differently based on the situation&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browser - Uses zone.js to keep track of these asynchronous contexts&lt;/li&gt;
&lt;li&gt;Node.js - Uses AsyncLocalStorage to keep track of asynchronous contexts&lt;/li&gt;
&lt;/ul&gt;
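Here is a tiny demo of the `AsyncLocalStorage` mechanism itself, using a request id string in place of a span context (Deno supports this API through its Node.js compatibility layer):

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

const requestContext = new AsyncLocalStorage();

// Simulate a handler that awaits a slow call mid-request
async function handleRequest(id: string) {
  return requestContext.run(id, async function () {
    // Yield to the event loop, like awaiting a real fetch
    await new Promise(function (resolve) { setTimeout(resolve, 5); });
    // Even after the await, getStore() still returns THIS request's id
    return requestContext.getStore();
  });
}

// Two overlapping "requests"; each keeps its own context across the await
const results = await Promise.all([handleRequest("A"), handleRequest("B")]);
```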

&lt;p&gt;Thanks to Deno’s Node.js compatibility efforts, all that’s needed to benefit from the Node.js async context management approach is to use the Node SDK:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;NodeTracerProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npm:@opentelemetry/sdk-trace-node&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NodeTracerProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This will automatically make sure &lt;code&gt;tracer.startSpan&lt;/code&gt; attaches the span it creates to the correct parent span in the above situation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Future Directions
&lt;/h2&gt;

&lt;p&gt;This example is just scratching the surface of using OpenTelemetry in Deno. Here are some ways to take it further.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adding Structured Data to Spans
&lt;/h3&gt;

&lt;p&gt;One thing the example didn't highlight is the ability to add structured data to spans, just like you would with logs (&lt;a href="https://opentelemetry.io/docs/instrumentation/js/instrumentation/#attributes" rel="noopener noreferrer"&gt;this OTEL doc&lt;/a&gt; covers how to do this in more detail).&lt;/p&gt;

&lt;p&gt;In certain use cases, you could potentially get away with only using traces and not using logs at all.&lt;/p&gt;
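The real API for this is `span.setAttribute(key, value)` and `span.setAttributes({...})` from `@opentelemetry/api`; the sketch below uses a minimal recording span so it stays self-contained, with made-up `app.*` attribute names:

```typescript
// Minimal stand-in for a span that accepts structured attributes
interface RecordingSpan {
  name: string;
  attributes: { [key: string]: string | number | boolean };
  setAttribute(key: string, value: string | number | boolean): void;
}

function startRecordingSpan(name: string): RecordingSpan {
  return {
    name,
    attributes: {},
    setAttribute(key, value) {
      this.attributes[key] = value;
    },
  };
}

// Instead of interpolating facts into a log string, attach them as
// individually queryable fields on the span
const bodySpan = startRecordingSpan("constructBody");
bodySpan.setAttribute("app.user_agent", "Deno/1.31");
bodySpan.setAttribute("app.body_bytes", 42);
```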

&lt;h3&gt;
  
  
  To Deploy and Beyond
&lt;/h3&gt;

&lt;p&gt;As of this writing, Deno Deploy doesn’t support &lt;code&gt;npm:&lt;/code&gt; specifiers, so it’s not possible to use the OTEL example above there.&lt;/p&gt;

&lt;p&gt;But once that support lands, this example should be good to go!&lt;/p&gt;

&lt;h3&gt;
  
  
  Actually Knowing CPU Time per Request
&lt;/h3&gt;

&lt;p&gt;Deno Deploy (and other edge function services) limit or bill based on the milliseconds of CPU time your functions use.&lt;/p&gt;

&lt;p&gt;However, there’s no easy way to profile or measure this today (as far as I’m aware).&lt;/p&gt;

&lt;p&gt;With OpenTelemetry traces, you should be able to drop spans into CPU-intensive parts of your code to zero in on what’s eating up CPU time.&lt;/p&gt;

&lt;p&gt;And with autoinstrumentation, you could even measure the underlying framework you’re using too.&lt;/p&gt;

&lt;h3&gt;
  
  
  Autoinstrumenting Deno-specific APIs and Libraries
&lt;/h3&gt;

&lt;p&gt;There are many &lt;a href="https://deno.land/api@v1.31.1" rel="noopener noreferrer"&gt;Deno APIs&lt;/a&gt; beyond &lt;code&gt;fetch&lt;/code&gt; that could benefit from autoinstrumentation (e.g. Cache), especially the Deno-specific ones, like the filesystem APIs, that don’t already have a browser autoinstrumentation package available.&lt;/p&gt;

&lt;p&gt;There are also a wide variety of Deno-specific libraries (e.g. oak, Fresh) that could benefit from autoinstrumentation too, either via a separate autoinstrumentation package or by building it into the library itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Putting the “Distributed” in Distributed Tracing
&lt;/h3&gt;

&lt;p&gt;This example involved only one service, but you could imagine running several distributed services as well.&lt;/p&gt;

&lt;p&gt;In that case, trace context propagation across the network is critical to producing one unbroken distributed trace: Service B needs to learn about the parent span Service A created before sending the request, so that Service B can attach its spans to the correct parent.&lt;/p&gt;

&lt;p&gt;I haven't yet tried to see if this just works out-of-the-box, but I’m guessing it may need some effort to get firing on all cylinders.&lt;/p&gt;
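For reference, the wire format that makes this work is the W3C Trace Context `traceparent` header, which OpenTelemetry propagates by default. A sketch of building and parsing it (simplified; the spec's handling of versions and flags is stricter):

```typescript
// The traceparent header has the shape:
//   "00" (version) - 32-hex trace id - 16-hex parent span id - 2-hex flags
function buildTraceparent(traceId: string, spanId: string, sampled: boolean): string {
  return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
}

function parseTraceparent(header: string) {
  const parts = header.split("-");
  if (parts.length !== 4 || parts[1].length !== 32 || parts[2].length !== 16) {
    return undefined; // malformed header
  }
  return { traceId: parts[1], parentSpanId: parts[2], sampled: parts[3] === "01" };
}
```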

&lt;h3&gt;
  
  
  All the Pillars
&lt;/h3&gt;

&lt;p&gt;This example was just of tracing, but OpenTelemetry covers metrics and logging as well (enabling cool things like finding trace exemplars that are contributing to a metric).&lt;/p&gt;

&lt;p&gt;It would be interesting to see if those work out-of-the-box for Deno too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Parting Thoughts
&lt;/h2&gt;

&lt;p&gt;I personally didn't anticipate that the Node.js compat work done by the Deno team would impact OpenTelemetry support, so this came as a pleasant surprise to me. I have no idea what that involved but hats off to the Deno folks for their efforts.&lt;/p&gt;

&lt;p&gt;And I hope there is more to come in making Deno the best server-side JS runtime around for production workloads!&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Deno_(software)" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Deno_(software)&lt;/a&gt; is where I got the definition of Deno from&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://opentelemetry.io/" rel="noopener noreferrer"&gt;https://opentelemetry.io/&lt;/a&gt; is where I got the definition of OpenTelemetry from&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://opentelemetry.io/docs/instrumentation/js/instrumentation/" rel="noopener noreferrer"&gt;https://opentelemetry.io/docs/instrumentation/js/instrumentation/&lt;/a&gt; and the remaining OTEL-JS docs have a lot of information on what more you can do with spans and how to go about doing it&lt;/li&gt;
&lt;li&gt;For more tips on troubleshooting beyond ConsoleSpanExporter, including turning on OpenTelemetry-JS's diagnostic Debug-level logging, see &lt;a href="https://opentelemetry.io/docs/instrumentation/js/getting-started/nodejs/#troubleshooting" rel="noopener noreferrer"&gt;https://opentelemetry.io/docs/instrumentation/js/getting-started/nodejs/#troubleshooting&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/open-telemetry/opentelemetry-js/issues/2293#issuecomment-1485395549" rel="noopener noreferrer"&gt;As of this writing, Deno supports Node.js's AsyncLocalStorage API for async context management except for setTimeout&lt;/a&gt; per Luca Casonato&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/grunet/can-you-measure-the-duration-of-a-promise-3a6h"&gt;This older article I wrote collects some notes on the history of async context management in JS&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Console Output
&lt;/h2&gt;

&lt;p&gt;For the curious, here is the output that the &lt;code&gt;ConsoleSpanExporter&lt;/code&gt; generates after the server handles a request&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;

&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;traceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;9a9a625e1562e9847ffe97e09fcf1bea&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;parentId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;f80a96f14dc334d8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;traceState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;constructBody&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;5a1974974e18fcf6&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1680829418679000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;195&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
  &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;events&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;links&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;traceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;9a9a625e1562e9847ffe97e09fcf1bea&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;parentId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;traceState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;handler&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;f80a96f14dc334d8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1680829418663000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;16340&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
  &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;events&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;links&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;traceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;9a9a625e1562e9847ffe97e09fcf1bea&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;parentId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;f80a96f14dc334d8&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;traceState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;HTTP GET&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;7ec3993d6f40671c&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1680829418669000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;12000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fetch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.method&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http://www.example.com/&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.status_code&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.status_text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.host&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;www.example.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.scheme&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.user_agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Deno/1.32.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;events&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;links&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;traceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;5086d5f4412c11a533d81044998ac7d6&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;parentId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;traceState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;HTTP POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;698b36fbc35e09c9&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1680829423694000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;94000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fetch&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.method&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.honeycomb.io/v1/traces&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.status_code&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.status_text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.host&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;api.honeycomb.io&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.scheme&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;http.user_agent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Deno/1.32.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;events&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;links&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>deno</category>
      <category>javascript</category>
      <category>observability</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How to Think About Software Supply Chain Security - Part 2</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sat, 01 Apr 2023 13:55:43 +0000</pubDate>
      <link>https://dev.to/grunet/how-to-think-about-software-supply-chain-security-part-2-964</link>
      <guid>https://dev.to/grunet/how-to-think-about-software-supply-chain-security-part-2-964</guid>
      <description>&lt;p&gt;(If you haven’t already, read or skim &lt;a href="https://dev.to/grunet/how-to-think-about-software-supply-chain-security-28eg"&gt;Part 1 of this series&lt;/a&gt; first for background.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Tracking Confidence and Risk
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--PNXM9GfG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nr06vaxcwz3d5tbmqwjs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--PNXM9GfG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nr06vaxcwz3d5tbmqwjs.png" alt="It looks like a histogram of a narrow bell curve, rotated 90 degrees clockwise so the base is at the far left and the tip is at the far right. Inside the bell curve near the left is the text High Confidence in a large font size. Inside the tip near the right is the text Low Confidence in a small font size. To the right of each bar of the histogram is a phrase. The phrases are the heading level threes in the article below. Each one is a software supply chain security risk, and the idea is they're chipping away at confidence. There's an arrow at the bottom indicating the risks to the left occur closer to the development phase, whereas the risks to the right occur closer to the deployment to production phase, roughly speaking." width="880" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are an extremely large number of software supply chain security risks. Each one of these risks can reduce confidence in the security of the software development process. &lt;/p&gt;

&lt;p&gt;When someone has an idea for a software change, it’s at its most secure. People’s brains cannot be infected or manipulated that easily (by software).&lt;/p&gt;

&lt;p&gt;However the change then has to go through design, then development, then validation, and ultimately deployment or release. At each stage there are a multitude of software supply chain security risks that can erode confidence. &lt;/p&gt;

&lt;p&gt;If left unmitigated, these risks can add up to a complete loss of confidence in the integrity of the final product and the security of the process itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Diving Deeper Into Individual Risks
&lt;/h2&gt;

&lt;p&gt;What follows is a brief exposition of each of the risks included in the above diagram.&lt;/p&gt;

&lt;p&gt;Note that what the diagram covers is only a small sample of all software supply chain security risks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lack of 2FA on Github
&lt;/h3&gt;

&lt;p&gt;If your Github password gets compromised, an attacker can now act as you.&lt;/p&gt;

&lt;p&gt;For example, they might write a Github workflow to exfiltrate all of your build-time secrets.&lt;/p&gt;

&lt;p&gt;Enforcing 2FA on all user accounts mitigates this risk.&lt;/p&gt;
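&lt;p&gt;As a minimal sketch, an organization owner can flip this on via the GitHub REST API (the &lt;code&gt;my-org&lt;/code&gt; name below is a placeholder):&lt;/p&gt;

```shell
# Require 2FA for all members of an organization (org name is a placeholder).
# Caution: members without 2FA enabled are removed from the org when this is turned on.
gh api --method PATCH /orgs/my-org \
  -F two_factor_requirement_enabled=true
```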

&lt;h3&gt;
  
  
  Long-Lived Github Personal Access Tokens
&lt;/h3&gt;

&lt;p&gt;If a PAT (personal access token) gets compromised, an attacker gains all of the token’s permissions on the Github account it belongs to, regardless of 2FA.&lt;/p&gt;

&lt;p&gt;For example, they could use the PAT to steal all of your confidential source code, and then use that information in a subsequent attack.&lt;/p&gt;

&lt;p&gt;There is no general mitigation for this (that I can think of) outside of avoiding use of PATs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Signed Commits Not Required
&lt;/h3&gt;

&lt;p&gt;If signed commits aren’t required in your Github repository or aren't in use by your team, an attacker who has already compromised some Github account can modify your commits after they’ve been made.&lt;/p&gt;

&lt;p&gt;For example, they might modify a commit you had previously made on a PR and that a reviewer had already reviewed, sneaking in subtle runtime secrets exfiltration code.&lt;/p&gt;

&lt;p&gt;Requiring signed commits in your repository eliminates this risk. And if you use Github Codespaces, your commits will automatically be signed.&lt;/p&gt;
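&lt;p&gt;For local setups, commit signing can also be turned on in Git itself. A minimal sketch using an SSH key (the key path is a placeholder; SSH signing requires Git 2.34+):&lt;/p&gt;

```shell
# Sign every commit with an SSH key instead of GPG (Git 2.34+).
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true

# Signatures on past commits can later be inspected with:
#   git log --show-signature -1
```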

&lt;h3&gt;
  
  
  Main Branch Not Protected
&lt;/h3&gt;

&lt;p&gt;If the main branch of your repository isn’t protected, an attacker who has already compromised some Github account can directly commit changes to it without anyone noticing.&lt;/p&gt;

&lt;p&gt;For example, they could add in some subtle runtime secrets exfiltration code.&lt;/p&gt;

&lt;p&gt;Protecting the main branch eliminates this risk.&lt;/p&gt;
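&lt;p&gt;As a sketch, branch protection can be applied through the GitHub REST API (the owner/repo names are placeholders). Requiring an approving review in the same call also addresses the code review risk covered next:&lt;/p&gt;

```shell
# Protect the main branch: require one approving review, apply the rules
# to admins too, and disallow force pushes (owner/repo are placeholders).
gh api --method PUT /repos/my-org/my-repo/branches/main/protection \
  --input - <<'EOF'
{
  "required_status_checks": null,
  "enforce_admins": true,
  "required_pull_request_reviews": { "required_approving_review_count": 1 },
  "restrictions": null,
  "allow_force_pushes": false
}
EOF
```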

&lt;h3&gt;
  
  
  Code Review Not Required
&lt;/h3&gt;

&lt;p&gt;If code review is not required in your repository, an attacker who has already compromised some Github account can make a PR (pull request) and merge it into main all by themselves.&lt;/p&gt;

&lt;p&gt;For example, they could add in some subtle runtime secrets exfiltration code via the PR.&lt;/p&gt;

&lt;p&gt;Requiring code review on all PRs eliminates this risk.&lt;/p&gt;

&lt;h3&gt;
  
  
  Builds Not Fully Automated using a CI Service
&lt;/h3&gt;

&lt;p&gt;If builds are not automated in an ephemeral environment, then malware lingering in the environment can infect the builds.&lt;/p&gt;

&lt;p&gt;For example, if a container image is built on someone’s computer, existing malware running on that computer could modify what’s included in the container image, inserting a backdoor for when it’s running in production later on.&lt;/p&gt;

&lt;p&gt;Using a build service like Github Actions prevents this, since each build runs on a new, clean VM (virtual machine).&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependencies’ Versions Not Pinned
&lt;/h3&gt;

&lt;p&gt;A dependency could be compromised by an attacker and a new malicious version of the dependency published. &lt;/p&gt;

&lt;p&gt;If dependencies aren’t pinned, the next build will pull in the malicious version of the dependency.&lt;/p&gt;

&lt;p&gt;Pinning the dependency ensures the same code is used each time unless someone explicitly chooses to change it.&lt;/p&gt;
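&lt;p&gt;A couple of hedged examples of what pinning can look like in practice (the package name, version, and commit SHA below are illustrative placeholders):&lt;/p&gt;

```shell
# npm: record exact versions (no ^ or ~ ranges) on every future install,
# e.g. "left-pad": "1.3.0" rather than "^1.3.0" in package.json.
npm config set save-exact true

# GitHub Actions: reference a full commit SHA instead of a mutable tag
# in workflow files (the SHA below is a placeholder):
#   uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3
```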

&lt;h3&gt;
  
  
  Dependencies Neither Cached in CI Nor Vendored
&lt;/h3&gt;

&lt;p&gt;If every build fetches dependencies from the internet, then that increases the chances of pulling in an existing version of a dependency that’s been corrupted.&lt;/p&gt;

&lt;p&gt;Caching dependencies in CI helps reduce the number of times fetching from the internet is required. Vendoring dependencies (i.e. including their code in your source code) erases this problem altogether.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dependencies Updated Too Often
&lt;/h3&gt;

&lt;p&gt;If you update your dependencies anytime a dependency publishes a new version, you’re at increased risk that one of those dependency updates has been compromised and you'll now be pulling in its malicious code.&lt;/p&gt;

&lt;p&gt;Updating only when there’s a new major version, while still taking all security patches, is one way to strike a safer balance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Secrets Not Restricted to Protected Branches
&lt;/h3&gt;

&lt;p&gt;If you have Github Secrets that are accessible outside of protected branches, anyone (e.g. a disgruntled employee) can write a Github Workflow in a throwaway branch to exfiltrate those secrets.&lt;/p&gt;

&lt;p&gt;Environment-based secrets in Github Actions can restrict secrets to protected branches and eliminate this risk.&lt;/p&gt;
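&lt;p&gt;As a sketch, an environment whose secrets are only available to protected branches can be set up via the REST API (the owner/repo and environment names are placeholders):&lt;/p&gt;

```shell
# Create (or update) a "production" environment whose secrets are only
# exposed to workflow runs on protected branches (names are placeholders).
gh api --method PUT /repos/my-org/my-repo/environments/production \
  --input - <<'EOF'
{
  "deployment_branch_policy": {
    "protected_branches": true,
    "custom_branch_policies": false
  }
}
EOF
```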

&lt;h3&gt;
  
  
  Secrets Not Least Permissioned
&lt;/h3&gt;

&lt;p&gt;If secrets include things like AWS access keys, and the permissions behind them are very broad (e.g. Administrator-level on an AWS account), then if/when the secrets are exfiltrated an attacker will have wide access to your AWS accounts.&lt;/p&gt;

&lt;p&gt;Restricting these kinds of secrets to the least permissions required to perform their functions (e.g. only enough for pushing to a container registry) is one mitigation.&lt;/p&gt;

&lt;p&gt;Another (imo easier) mitigation if the tool supports it is to use ephemeral access keys via Github’s OIDC provider. These can be configured so the access keys only exist for a few minutes, so even if they are exfiltrated they are hard to abuse.&lt;/p&gt;
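&lt;p&gt;A minimal sketch of what this looks like with GitHub Actions and AWS; the role ARN, account id, region, and action version are all placeholder assumptions:&lt;/p&gt;

```shell
# Write a workflow that trades a GitHub OIDC token for short-lived AWS
# credentials; no long-lived access keys are stored as repository secrets.
mkdir -p .github/workflows
cat > .github/workflows/deploy.yml <<'EOF'
on:
  push:
    branches: [main]
permissions:
  id-token: write   # lets the job request an OIDC token from GitHub
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: us-east-1
EOF
```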

&lt;h3&gt;
  
  
  Outbound Requests Made during CI
&lt;/h3&gt;

&lt;p&gt;If arbitrary outbound requests are allowed during CI, malicious code that’s already infiltrated your CI environment can exfiltrate secrets. Disallowing them prevents this (similar to the idea of “air gapping” a build).&lt;/p&gt;

&lt;p&gt;In practice this can be difficult to pull off, or even monitor for.&lt;/p&gt;

&lt;h3&gt;
  
  
  Deploys/Releases Not Automated
&lt;/h3&gt;

&lt;p&gt;If deployments or releases aren’t automated from an ephemeral environment, then malware lingering in the environment can affect the deployment or release.&lt;/p&gt;

&lt;p&gt;For example, if changes to your cloud IAC (infrastructure as code) are done from someone’s computer, existing malware on their computer could include extra cloud resources into changesets (e.g. cryptominers).&lt;/p&gt;

&lt;p&gt;Outside of fully automating deploys or releases, one mitigation for this is to do the process from a clean machine (e.g. Cloud Shell in AWS or GCP) every time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Production Access Outside of Deployment/Release Branch
&lt;/h3&gt;

&lt;p&gt;This is the same as “Secrets Not Restricted to Protected Branches” for the special (and much worse) case where the secrets contain access credentials to your production environments.&lt;/p&gt;

&lt;p&gt;The same mitigation about using Environments in Github Actions to restrict the branches the secrets are accessible from applies here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Supply chain security is hard to grok because it requires you to be skeptical about things you also have to trust heavily. This tension is difficult to grapple with.&lt;/p&gt;

&lt;p&gt;Thinking about things from a risk-first (or equivalently confidence-first) perspective has proven useful to me in dealing with this tension.&lt;/p&gt;

&lt;p&gt;To take a deeper dive into the world of software supply chain security, check out &lt;a href="https://slsa.dev/"&gt;https://slsa.dev/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>security</category>
    </item>
    <item>
      <title>Modern Accessibility</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sat, 25 Mar 2023 23:26:43 +0000</pubDate>
      <link>https://dev.to/grunet/modern-accessibility-51kn</link>
      <guid>https://dev.to/grunet/modern-accessibility-51kn</guid>
      <description>&lt;h2&gt;
  
  
  Guessing at What the State of the Art in Web Accessibility Will Look Like in 5 Years
&lt;/h2&gt;

&lt;p&gt;Web design and development evolve at a rapid pace. It's reasonable to assume web accessibility will evolve similarly over the next few years.&lt;/p&gt;

&lt;p&gt;This is my attempt to guess at what the best organizations will be doing for it in 5 years time (inspired by &lt;a href="https://www.moderntesting.org/"&gt;the Modern Testing principles&lt;/a&gt; and the &lt;a href="https://www.devops-research.com/research.html"&gt;DevOps Research and Assessment studies&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  Protecting Abled Usage and Optimizing for Experimentation with Disabled Experiences
&lt;/h2&gt;

&lt;p&gt;There are 2 high-level aspects to my guess at what the top organizations will be doing:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Protecting Abled (i.e. non-disabled) Experiences&lt;/li&gt;
&lt;li&gt;Optimizing for Experimentation with Disabled Experiences &lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Protecting Abled Experiences
&lt;/h3&gt;

&lt;p&gt;Before organizations can focus on optimizing their disabled users' experiences, they'll have to first make sure they avoid impacting or regressing their abled users' experiences.&lt;/p&gt;

&lt;h4&gt;
  
  
  Avoidance of Designs that Cannot Be Made Accessible Without Affecting Abled Experiences
&lt;/h4&gt;

&lt;p&gt;At the design level, this means avoiding design patterns that cannot be made accessible without changing how abled users experience them.&lt;/p&gt;

&lt;p&gt;For example, take ephemeral toast notifications. There's no way to make these accessible after-the-fact without creating a drawer containing all of the notifications. If there's no room for such a drawer in the design without impacting the abled experience, the design can't be made accessible.&lt;/p&gt;

&lt;p&gt;For a contrary example, take icon buttons that end up not having accessible labels. They can be given accessible labels without needing to adjust the abled experience at all. The design can be made accessible without impacting the abled experience.&lt;/p&gt;

&lt;h4&gt;
  
  
  Functional and Visual Regression Tests of Abled Workflows Derived From Telemetry
&lt;/h4&gt;

&lt;p&gt;Automated functional tests protecting abled user flows can give teams experimenting with accessibility changes extra confidence that their changes won't break abled use cases.&lt;/p&gt;

&lt;p&gt;Layering on automated visual regression tests can enhance that confidence to another level.&lt;/p&gt;

&lt;p&gt;And deriving the tests from production telemetry makes sure that the right abled user flows are being encoded into automation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Optimizing for Experimentation with Disabled Experiences
&lt;/h3&gt;

&lt;p&gt;All of that protection should enable teams to experiment at will when it comes to disabled user experiences, without fear of side-effects.&lt;/p&gt;

&lt;h4&gt;
  
  
  Lead Times on the Order of Seconds
&lt;/h4&gt;

&lt;p&gt;A key prerequisite to this is having an extremely short time in-between having an idea for an experiment and getting real feedback on it.&lt;/p&gt;

&lt;p&gt;Bringing down lead times to a few seconds helps to this end. Tests can be shifted to run as synthetic monitors against production to help with this.&lt;/p&gt;

&lt;h4&gt;
  
  
  Living in Production
&lt;/h4&gt;

&lt;p&gt;At this point, teams will be effectively "living in production" and can focus on experimentation.&lt;/p&gt;

&lt;h5&gt;
  
  
  Experiment-first Mentality
&lt;/h5&gt;

&lt;p&gt;At the end of the day, the only people who can tell if an experience is accessible are the disabled users who experience it. No amount of prior team experience or knowledge can substitute for this.&lt;/p&gt;

&lt;p&gt;Teams will construct experiments on how to improve disabled user experiences and measure them through several means in production. The successful experiments will live on, and the team will continue to iterate via experiments.&lt;/p&gt;

&lt;h5&gt;
  
  
  Analytics-first Mentality
&lt;/h5&gt;

&lt;p&gt;Teams will leverage anonymized, aggregated metrics derived from their analytics that serve as indicators for disabled user experiences.&lt;/p&gt;

&lt;p&gt;A common tactic will be to compare metrics (e.g. conversion rates) for disabled user groups against abled user groups. If the disabled user groups are performing more poorly than their abled counterparts, it will indicate more experimentation is needed.&lt;/p&gt;

&lt;h5&gt;
  
  
  Zero-Effort Generation of Strong Ties to the Disability Community
&lt;/h5&gt;

&lt;p&gt;Analytics alone won’t generate enough useful information. Teams will leverage 3rd parties to connect them with their disabled users, so they can apply user research techniques to better understand those users and drive future experiments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;My guess is the organizations that will be doing the best at accessibility will be the ones that work the closest with their disabled users, and then optimize all of their processes towards experimenting to find the best solutions for those users.&lt;/p&gt;

&lt;p&gt;(Slack already does parts of this today to my understanding, which is why it doesn't seem too farfetched to me.)&lt;/p&gt;

</description>
      <category>a11y</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>How to Think About Software Supply Chain Security - Part 1</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sat, 25 Mar 2023 02:54:08 +0000</pubDate>
      <link>https://dev.to/grunet/how-to-think-about-software-supply-chain-security-28eg</link>
      <guid>https://dev.to/grunet/how-to-think-about-software-supply-chain-security-28eg</guid>
      <description>&lt;h2&gt;
  
  
  How Software Supply Chain Security Differs from Normal Security
&lt;/h2&gt;

&lt;p&gt;With normal security, the concern is primarily with malicious, external actors probing your software looking for direct exploits.&lt;/p&gt;

&lt;p&gt;With software supply chain security, the concern is more about malicious actors exploiting backdoors in the creation of your software.&lt;/p&gt;

&lt;p&gt;The terminology is different too. Whereas it makes sense to talk about "trust" in the context of normal security, that concept loses usefulness in the context of software supply chain security.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to be Concerned about When it Comes to Software Supply Chain Security
&lt;/h2&gt;

&lt;p&gt;There are 2 main areas to be concerned about:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Artifact Integrity&lt;/li&gt;
&lt;li&gt;Exfiltration&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Artifact Integrity
&lt;/h3&gt;

&lt;p&gt;Artifact integrity has to do with trying to make sure that the software that was delivered and/or is running in production is actually what was intended to be made.&lt;/p&gt;

&lt;p&gt;An example of when this would fail is when a malicious actor is able to modify the source code used to build a container, including a backdoor that lets them collect sensitive user information.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exfiltration
&lt;/h3&gt;

&lt;p&gt;Exfiltration in this context is concerned with trying to make sure sensitive parts of the supply chains themselves aren't stolen (e.g. source code, build secrets).&lt;/p&gt;

&lt;p&gt;An example of when this has happened is the &lt;a href="https://circleci.com/blog/jan-4-2023-incident-report/"&gt;CircleCI security incident&lt;/a&gt; from a few months ago, where all customer build secrets were compromised by a malicious actor.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Think About the Techniques That Address The Concerns
&lt;/h2&gt;

&lt;p&gt;There are many techniques available to address these two concerns, but one helpful way to categorize them is by how they impact the risks involved. The 3 most prominent categories are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Risk Elimination&lt;/li&gt;
&lt;li&gt;Risk Mitigation&lt;/li&gt;
&lt;li&gt;Risk Awareness&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Risk Elimination
&lt;/h3&gt;

&lt;p&gt;These techniques eliminate certain classes of risk altogether. For example,&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Signing all Git operations (e.g. commits, tags)&lt;/li&gt;
&lt;li&gt;Automating builds and running them in an isolated, ephemeral environment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The former prevents malicious actors from impersonating valid contributors.&lt;/p&gt;

&lt;p&gt;The latter prevents any long-lived malicious software from living in the build environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Risk Mitigation
&lt;/h3&gt;

&lt;p&gt;These techniques reduce the chances of certain classes of risk. For example,&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Peer review of code changes&lt;/li&gt;
&lt;li&gt;Using a dedicated build service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The former mitigates the case of 1 disgruntled employee trying to submit malicious code, but it does nothing for the case of 2 disgruntled employees colluding to submit malicious code.&lt;/p&gt;

&lt;p&gt;The latter will generally improve the security of the secrets kept inside the build service. However, as the CircleCI incident showed, all build service platforms are still fallible when it comes to secrets exfiltration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Risk Awareness
&lt;/h3&gt;

&lt;p&gt;These techniques give you more insight into the risk profile of certain classes of risk. For example,&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gathering all manifests (e.g. Software Bills of Material, SBOMs) of all of your dependencies&lt;/li&gt;
&lt;li&gt;Checking Reddit before updating a dependency in case there's a well-known compromise in flight&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The former helps increase awareness of all the pieces comprising the software, as well as their individual security vulnerabilities (notice how this overlaps with normal security concerns). &lt;/p&gt;
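
&lt;p&gt;As a sketch of the first technique, assuming the open-source Syft and Grype CLIs are installed, generating an SBOM and then scanning it for known vulnerabilities can look like this:&lt;/p&gt;

```shell
# Generate an SPDX-format SBOM for the current project directory
syft dir:. -o spdx-json > sbom.json

# Scan that SBOM against known vulnerability databases
grype sbom:./sbom.json
```

&lt;p&gt;The same pair of commands works against container images, which is handy when the thing you ship isn't just source code.&lt;/p&gt;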

&lt;p&gt;The latter can alert you, before you merge a Dependabot dependency update PR, that the new version may contain malicious code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Risk is the primitive to use when thinking about software supply chain security.&lt;/p&gt;

&lt;p&gt;Thinking about which risks a technique or tool impacts makes the technique itself easier to reason about.&lt;/p&gt;

&lt;p&gt;For more on this, including more practical examples, check out &lt;a href="https://dev.to/grunet/how-to-think-about-software-supply-chain-security-part-2-964"&gt;Part 2 of this series&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To take a deeper dive into the world of software supply chain security, check out &lt;a href="https://slsa.dev/"&gt;https://slsa.dev/&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>security</category>
    </item>
    <item>
      <title>How to Maximize User Privacy When Using Google Analytics 4</title>
      <dc:creator>Grunet</dc:creator>
      <pubDate>Sat, 04 Mar 2023 23:42:57 +0000</pubDate>
      <link>https://dev.to/grunet/how-to-maximize-user-privacy-when-using-google-analytics-4-4cd7</link>
      <guid>https://dev.to/grunet/how-to-maximize-user-privacy-when-using-google-analytics-4-4cd7</guid>
      <description>&lt;h2&gt;
  
  
  What is Google Analytics 4?
&lt;/h2&gt;

&lt;p&gt;Web analytics is the practice of gathering information about how your users are using your websites, for the purposes of marketing, sales, improving product offerings, etc...&lt;/p&gt;

&lt;p&gt;Google Analytics is one such tool that aids in this practice. It is by far the most popular one.&lt;/p&gt;

&lt;p&gt;Google Analytics 4 is the latest iteration of the tool. It was created in large part as a response to GDPR (General Data Protection Regulation) privacy legislation in the EU (European Union) that Universal Analytics ("Google Analytics 3") couldn't support.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does Google Analytics 4 Treat Privacy By Default?
&lt;/h2&gt;

&lt;p&gt;Despite the increased focus on privacy, it doesn't look great. &lt;/p&gt;

&lt;p&gt;It has several defaults that are unnecessary for ordinary website analytics, exposing way more information than needed back to Google. &lt;/p&gt;

&lt;h2&gt;
  
  
  Steps to Take to Maximize Your Users' Privacy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Turn Off All Account Data Sharing Settings
&lt;/h3&gt;

&lt;p&gt;By default, you're opted in to sharing your users' analytics data with these 4 other entities&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google Products &amp;amp; Services&lt;/li&gt;
&lt;li&gt;Modeling Contributions &amp;amp; Business Insights&lt;/li&gt;
&lt;li&gt;Technical Support&lt;/li&gt;
&lt;li&gt;Account Specialists (aka Google salespeople)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Uncheck all of them during the onboarding flow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use "Device-based" for Reporting Identity
&lt;/h3&gt;

&lt;p&gt;Reporting Identity refers to how Google Analytics tracks your users across different websites and different devices.&lt;/p&gt;

&lt;p&gt;By default this is set to "Blended", which includes&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Google signals (aka tracking your users based on their having logged into their Google account on any browser or device)&lt;/li&gt;
&lt;li&gt;Modeling (&lt;a href="https://support.google.com/analytics/answer/11161109"&gt;aka using machine learning based on users who accepted tracking to infer behaviors of users who declined tracking, so they can be tracked...&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both of these are overkill for basic website analytics, and overreach into your users' privacy.&lt;/p&gt;

&lt;p&gt;The best alternative is actually hidden. You have to hit "Show All" to uncover the "Device-based" option.&lt;/p&gt;

&lt;p&gt;Go to Admin, then under the Property column find Reporting Identity, then hit "Show All", then select "Device-based".&lt;/p&gt;

&lt;h4&gt;
  
  
  Actually Make the "Device-based" Choice Useful
&lt;/h4&gt;

&lt;p&gt;Even with this choice, &lt;a href="https://support.google.com/analytics/answer/11593727"&gt;Google is still able to track your users across your site by setting a first-party cookie&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;While the cookie may be first-party, Google is most certainly not. Any tracking at this level should at best be done by solutions that let you be the steward of your users' data, not Google.&lt;/p&gt;

&lt;p&gt;To stop this tracking, you need to deny tracking by default on behalf of your users (as if every user had declined all such tracking via your cookie notice).&lt;/p&gt;

&lt;p&gt;The details vary by platform and integration, but &lt;a href="https://developers.google.com/tag-platform/devguides/consent#implementation_example"&gt;seem to be eventually findable in the docs (e.g. for web)&lt;/a&gt;.&lt;/p&gt;
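
&lt;p&gt;For the web integration, those docs boil down to setting the consent defaults to "denied" before any gtag config call runs. A minimal sketch (using globalThis in place of the browser's window object so the snippet runs anywhere):&lt;/p&gt;

```javascript
// In a real page this would be window.dataLayer, exactly as in Google's
// standard snippet; globalThis keeps the sketch runnable outside a browser.
globalThis.dataLayer = globalThis.dataLayer || [];
function gtag() { dataLayer.push(arguments); }

// Deny storage for ads and analytics up front, before gtag('js', ...) or
// gtag('config', ...) ever runs, so no tracking cookies get set.
gtag('consent', 'default', {
  ad_storage: 'denied',
  analytics_storage: 'denied'
});
```

&lt;p&gt;The key detail is ordering: this has to appear above the Google tag snippet on the page, or the first config call will set cookies before the denial takes effect.&lt;/p&gt;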

&lt;h2&gt;
  
  
  What You're Left With
&lt;/h2&gt;

&lt;p&gt;After all that, Google Analytics 4 should be a tool that&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Captures anonymous analytics about how your users are using your site&lt;/li&gt;
&lt;li&gt;Lets you add your own custom anonymous instrumentation to capture events it doesn't by default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Or at least I hope so...&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Google Analytics 4 is not an analytics tool. It's an advertising and marketing tool.&lt;/p&gt;

&lt;p&gt;That's the only framing I can make that explains why its defaults are the way they are.&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>privacy</category>
    </item>
  </channel>
</rss>
