<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: John McBride</title>
    <description>The latest articles on DEV Community by John McBride (@jpmcb).</description>
    <link>https://dev.to/jpmcb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1105969%2F5af7806f-98cb-4f2f-b32a-7b650a4e8fd7.jpeg</url>
      <title>DEV Community: John McBride</title>
      <link>https://dev.to/jpmcb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jpmcb"/>
    <language>en</language>
    <item>
      <title>OpenSauced on Azure: Lessons learned from a near-zero downtime migration</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Tue, 15 Oct 2024 16:50:08 +0000</pubDate>
      <link>https://dev.to/opensauced/opensauced-on-azure-lessons-learned-from-a-near-zero-downtime-migration-40b9</link>
      <guid>https://dev.to/opensauced/opensauced-on-azure-lessons-learned-from-a-near-zero-downtime-migration-40b9</guid>
<description>&lt;p&gt;At the beginning of October, the OpenSauced engineering team completed a weeks-long migration of our infrastructure, data, and pipelines to Microsoft Azure. Before this move, we had several bespoke container apps on DigitalOcean alongside managed PostgreSQL databases.&lt;/p&gt;

&lt;p&gt;This setup worked well for a while and was a great way to bootstrap. But, because we lacked GitOps, infrastructure-as-code (IaC) tooling, and a structured method for storing secrets in those early days, our app configurations could be brittle, prone to breaking during upgrades or releases, and difficult to scale in a streamlined manner.&lt;/p&gt;

&lt;p&gt;We ultimately decided to migrate our core backend infrastructure from DigitalOcean to Azure, consolidating everything into a unified environment. This move allowed us to capitalize on our existing Azure Kubernetes Service (AKS) infrastructure and fully commit to Kubernetes as our primary service and container orchestration platform.&lt;/p&gt;

&lt;h3&gt;Azure Kubernetes Service for container runtimes&lt;/h3&gt;

&lt;p&gt;If you've read any of my previous engineering deep dives (including &lt;a href="https://opensauced.pizza/blog/technical-deep-dive:-how-we-built-the-pizza-cli-using-go-and-cobra" rel="noopener noreferrer"&gt;Technical Deep Dive: How We Built the Pizza CLI Using Go and Cobra&lt;/a&gt;, &lt;a href="https://opensauced.pizza/blog/how-we-use-kubernetes-jobs-to-scale-openssf-scorecard" rel="noopener noreferrer"&gt;How we use Kubernetes jobs to scale OpenSSF Scorecard&lt;/a&gt;, and &lt;a href="https://opensauced.pizza/blog/how-we-saved-thousands-of-dollars-deploying-low-cost-open-source-ai-technologies" rel="noopener noreferrer"&gt;How We Saved 10s of Thousands of Dollars Deploying Low Cost Open Source AI Technologies At Scale with Kubernetes&lt;/a&gt;), you know that we already deploy several AI services and core data pipelines on AKS (primarily the services that power StarSearch).&lt;/p&gt;

&lt;p&gt;To simplify our infrastructure and make the most of our existing compute resources in our AKS clusters, we adopted a "monolithic cluster" approach. This means we’re deploying all infrastructure, APIs, and services to the same AKS clusters, centralizing control, management, deployment, and scaling.&lt;/p&gt;

&lt;p&gt;The benefits are clear: we avoid the complexity of multi-cluster management, consolidate our networking within a single region, and streamline operations for our small, agile engineering team.&lt;/p&gt;

&lt;p&gt;However, this approach has trade-offs we may need to tackle in the future. As OpenSauced grows and scales, we’ll need to reassess and likely adopt a multi-region or multi-cluster strategy to support a globally distributed network. We made this decision fully aware of the scalability challenges ahead, but for now, this approach gives us the flexibility and simplicity we need.&lt;/p&gt;

&lt;h3&gt;Choosing a Kubernetes Ingress controller&lt;/h3&gt;

&lt;p&gt;With AKS now handling all our backend infrastructure, including public-facing APIs, we needed an ingress solution for routing external traffic into our clusters, along with load balancing, firewall management, TLS certificates from Let's Encrypt, and security policies.&lt;/p&gt;

&lt;p&gt;We chose Traefik as our Kubernetes ingress controller. Traefik, a popular choice in the Kubernetes community, is an "application proxy" that offers a rich set of features while being easy to set up. With Traefik, what could have been a complex, error-prone task became an intuitive and streamlined integration into our infrastructure.&lt;/p&gt;

&lt;h3&gt;Using Pulumi for infrastructure as code and deployment&lt;/h3&gt;

&lt;p&gt;A key part of our migration was adopting Pulumi as our infrastructure-as-code solution. Before this, our infrastructure setup was a bit ad-hoc, with various configurations and third-party services stitched together manually. When we needed a new cloud service or were ready to deploy a new API service, we'd piece the different bits together in cloud dashboards and build custom automation in GitHub Actions. While this worked in the very early stages of OpenSauced, it quickly became brittle and hard to manage at scale or across an engineering team.&lt;/p&gt;

&lt;p&gt;Pulumi offers several benefits that have already had a noticeable impact on our workflows and engineering culture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Environment Reproducibility: We can easily create and replicate environments, whether spinning up a new Kubernetes cluster or a full staging environment. It’s as simple as creating a new Pulumi stack.&lt;/li&gt;
&lt;li&gt;Simple, Consistent Deployments: Deployments are straightforward, repeatable, and integrated into our CI/CD pipelines.&lt;/li&gt;
&lt;li&gt;State and Secret Management: Pulumi provides a built-in mechanism for storing state and secrets, which can be securely shared across the entire engineering team.&lt;/li&gt;
&lt;li&gt;GitOps Compatibility: By leveraging Pulumi’s tight integration with Git, we can adopt deeper GitOps workflows, bringing more automation and consistency to our infrastructure management.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Overall, Pulumi has significantly reduced the friction around infrastructure management and deploying new services, allowing us to focus on what really matters — building OpenSauced!&lt;/p&gt;
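&lt;p&gt;To make this concrete, here’s a minimal sketch of what a Pulumi program looks like in Go. The resource names here are hypothetical, not our actual stack definitions; our real stacks declare the AKS clusters, databases, and supporting services discussed in this post the same way:&lt;/p&gt;

```go
package main

import (
	"github.com/pulumi/pulumi-azure-native-sdk/resources/v2"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		// Declare an Azure resource group; Pulumi records it in the stack's
		// state, so `pulumi up` creates it and later runs reconcile any drift.
		rg, err := resources.NewResourceGroup(ctx, "opensauced-rg", &resources.ResourceGroupArgs{
			Location: pulumi.String("eastus"),
		})
		if err != nil {
			return err
		}

		// Export the generated name so other stacks or tooling can reference it.
		ctx.Export("resourceGroupName", rg.Name)
		return nil
	})
}
```

Running `pulumi up` against a program like this is what makes environments reproducible: spinning up a staging copy is just creating a new stack from the same code.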

&lt;h3&gt;Azure Flexible Server for managed Postgres&lt;/h3&gt;

&lt;p&gt;For the data layer at OpenSauced (including user data, user assets, and GitHub repository metadata), we previously used DigitalOcean’s managed PostgreSQL service. For our migration to Azure, we opted for Azure Database for PostgreSQL with the Flexible Server deployment option.&lt;/p&gt;

&lt;p&gt;This service gives us all the benefits of a managed database solution, including automated backups, restoration capabilities, and high availability. The bonus here is that we can co-locate our data with our AKS clusters in the same region, ensuring low-latency networking between our services on-cluster and the database.&lt;/p&gt;

&lt;p&gt;Looking ahead, as our user base grows, we’ll need to explore data replication and distribution to additional regions to enhance availability and redundancy. But for now, this managed solution meets our needs and positions us well for future scalability.&lt;/p&gt;

&lt;p&gt;Hats off to the Azure Postgres team for enabling a smooth and near-zero-downtime migration of our data. All in all, using Azure's provided migration tools, moving everything over took less than 5 minutes, and we completed the production migration with minimal end-user impact. Because we used Pulumi both to configure all our containers on-cluster and to deploy the Postgres flexible servers, we could quickly and easily re-deploy our containers with updated configurations pointing at the new databases.&lt;/p&gt;

&lt;p&gt;Between our Kubernetes environment, Pulumi IaC tooling, and Azure's sublime migration tools, we were able to complete a full production migration seamlessly.&lt;/p&gt;

&lt;h3&gt;Grafana Observability&lt;/h3&gt;

&lt;p&gt;As part of this migration, we also made some enhancements to our observability stack to ensure that our backend infrastructure is properly monitored. We use Grafana for observability, and during the migration, we deployed Grafana Alloy on our clusters. Alloy integrates seamlessly with Prometheus for metrics and Loki for log aggregation, giving us a powerful observability framework.&lt;/p&gt;

&lt;p&gt;With these tools in place, we have a comprehensive view of our system’s health, allowing us to monitor performance, detect anomalies, and respond to issues before they impact our users. Additionally, our integration with Grafana’s on-call and alerting features enables our engineering team to respond to incidents and keep OpenSauced healthy.&lt;/p&gt;




&lt;p&gt;A huge thank you to our Microsoft Azure partners for enabling us to make this transition, providing their expertise, and supporting us along the way!&lt;/p&gt;

&lt;p&gt;As always, stay saucy friends!!&lt;/p&gt;

</description>
      <category>azure</category>
      <category>kubernetes</category>
      <category>infrastructureascode</category>
    </item>
    <item>
      <title>Technical Deep Dive: How We Built the Pizza CLI Using Go and Cobra</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Mon, 23 Sep 2024 16:07:21 +0000</pubDate>
      <link>https://dev.to/opensauced/technical-deep-dive-how-we-built-the-pizza-cli-using-go-and-cobra-oad</link>
      <guid>https://dev.to/opensauced/technical-deep-dive-how-we-built-the-pizza-cli-using-go-and-cobra-oad</guid>
      <description>&lt;p&gt;Last week, &lt;a href="https://opensauced.pizza/blog/introducing-the-pizza-cli" rel="noopener noreferrer"&gt;the OpenSauced engineering team released the Pizza CLI&lt;/a&gt;, a powerful and composable command-line tool for generating CODEOWNER files and integrating with the OpenSauced platform. Building robust command-line tools may seem straightforward, but without careful planning and thoughtful paradigms, CLIs can quickly become tangled messes of code that are difficult to maintain and riddled with bugs. In this blog post, we'll take a deep dive into how we built this CLI using Go, how we organize our commands using Cobra, and how our lean engineering team iterates quickly to build powerful functionality.&lt;/p&gt;

&lt;h2&gt;Using Go and Cobra&lt;/h2&gt;

&lt;p&gt;The Pizza CLI is a Go command-line tool that leverages several well-established libraries. Go’s simplicity, speed, and systems programming focus make it an ideal choice for building CLIs. At its core, the Pizza-CLI uses &lt;a href="https://github.com/spf13/cobra" rel="noopener noreferrer"&gt;spf13/cobra&lt;/a&gt;, a CLI bootstrapping library for Go, to organize and manage the entire tree of commands.&lt;/p&gt;

&lt;p&gt;You can think of Cobra as the scaffolding that makes the command-line interface work: it wires up the command tree, keeps flag handling consistent, and communicates with users via help messages and automated documentation.&lt;/p&gt;

&lt;h3&gt;Structuring the Codebase&lt;/h3&gt;

&lt;p&gt;One of the first (and biggest) challenges when building a Cobra-based Go CLI is how to structure all your code and files. Contrary to popular belief, there is &lt;strong&gt;&lt;em&gt;no&lt;/em&gt;&lt;/strong&gt; prescribed way to do this in Go. Neither the &lt;code&gt;go build&lt;/code&gt; command nor the &lt;code&gt;gofmt&lt;/code&gt; utility will complain about how you name your packages or organize your directories. This is one of the best parts of Go: its simplicity and power make it easy to define structures that work for you and your engineering team!&lt;/p&gt;

&lt;p&gt;Ultimately, in my opinion, it's best to think of and structure a Cobra-based Go codebase as a tree of commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── Root command
│   ├── Child command
│   ├── Child command
│   │   └── Grandchild command
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At the base of the tree is the root command: this is the anchor for your entire CLI application and will get the name of your CLI. Attached as child commands, you’ll have a tree of branching logic that informs the structure of how your entire CLI flow works.&lt;/p&gt;

&lt;p&gt;One of the things that’s incredibly easy to miss when building CLIs is the user experience. I typically recommend people follow a “root verb noun” paradigm when building commands and child-command structures since it flows logically and leads to excellent user experiences.&lt;/p&gt;

&lt;p&gt;For example, in &lt;a href="https://kubernetes.io/docs/reference/kubectl/" rel="noopener noreferrer"&gt;Kubectl&lt;/a&gt;, you’ll see this paradigm everywhere: “kubectl get pods”, “kubectl apply …“, or “kubectl label pods …” This ensures a sensible flow to how users will interact with your command-line application and helps a lot when talking about commands with other people.&lt;/p&gt;

&lt;p&gt;In the end, this structure and suggestion can inform how you organize your files and directories, but again, ultimately it’s up to you to determine how you structure your CLI and present the flow to end-users.&lt;/p&gt;

&lt;p&gt;In the Pizza CLI, we have a well-defined structure for where child commands (and the grandchildren of those child commands) live: under the &lt;code&gt;cmd&lt;/code&gt; directory, each command gets its own implementation in its own package. The root command scaffolding lives in a &lt;code&gt;pkg/utils&lt;/code&gt; directory, since it's useful to think of the root command as a top-level utility used by &lt;code&gt;main.go&lt;/code&gt; rather than a command that needs much ongoing maintenance. Typically, your root command implementation is boilerplate that sets things up and rarely changes, so it’s nice to get that out of the way.&lt;/p&gt;

&lt;p&gt;Here's a simplified view of our directory structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── main.go
├── pkg/
│   ├── utils/
│   │   └── root.go
├── cmd/
│   ├── Child command dir
│   ├── Child command dir
│   │   └── Grandchild command dir
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure allows for clear separation of concerns and makes it easier to maintain and extend the CLI as it grows and as we add more commands.&lt;/p&gt;

&lt;h2&gt;Using go-git&lt;/h2&gt;

&lt;p&gt;One of the main libraries we use in the Pizza-CLI is the &lt;a href="https://github.com/go-git/go-git" rel="noopener noreferrer"&gt;go-git&lt;/a&gt; library, a pure git implementation in Go that is highly extensible. During &lt;code&gt;CODEOWNERS&lt;/code&gt; generation, this library enables us to iterate the git ref log, look at code diffs, and determine which git authors are associated with the configured attributions defined by a user.&lt;/p&gt;

&lt;p&gt;Iterating the git ref log of a local git repo is actually pretty simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// 1. Open the local git repository&lt;/span&gt;
&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PlainOpen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/path/to/your/repo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"could not open git repository"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// 2. Get the HEAD reference for the local git repo&lt;/span&gt;
&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"could not get repo head"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// 3. Create a git ref log iterator based on some options&lt;/span&gt;
&lt;span class="n"&gt;commitIter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;git&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LogOptions&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;From&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hash&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"could not get repo log iterator"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;commitIter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// 4. Iterate through the commit history&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;commitIter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ForEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;commit&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// process each commit as the iterator iterates them&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"could not process commit iterator"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you’re building a Git based application, I definitely recommend using go-git: it’s fast, integrates well within the Go ecosystem, and can be used to do all sorts of things!&lt;/p&gt;

&lt;h2&gt;Integrating Posthog telemetry&lt;/h2&gt;

&lt;p&gt;Our engineering and product team is deeply invested in bringing the best possible command line experience to our end users: this means we’ve taken steps to integrate anonymized telemetry that can report to Posthog on usage and errors out in the wild. This has allowed us to fix the most important bugs first, iterate quickly on popular feature requests, and understand how our users are using the CLI.&lt;/p&gt;

&lt;p&gt;Posthog has &lt;a href="https://github.com/PostHog/posthog-go" rel="noopener noreferrer"&gt;a first-party Go library&lt;/a&gt; that supports this exact functionality. First, we define a Posthog client:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/posthog/posthog-go"&lt;/span&gt;

&lt;span class="c"&gt;// PosthogCliClient is a wrapper around the posthog-go client and is used as a&lt;/span&gt;
&lt;span class="c"&gt;// API entrypoint for sending OpenSauced telemetry data for CLI commands&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;PosthogCliClient&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// client is the Posthog Go client&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="n"&gt;posthog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;

    &lt;span class="c"&gt;// activated denotes if the user has enabled or disabled telemetry&lt;/span&gt;
    &lt;span class="n"&gt;activated&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;

    &lt;span class="c"&gt;// uniqueID is the user's unique, anonymous identifier&lt;/span&gt;
    &lt;span class="n"&gt;uniqueID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, after initializing a new client, we can use it through the various struct methods we’ve defined. For example, when logging into the OpenSauced platform, we capture specific information on a successful login:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// CaptureLogin gathers telemetry on users who log into OpenSauced via the CLI&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;PosthogCliClient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;CaptureLogin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;activated&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;posthog&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Capture&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;DistinctId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="s"&gt;"pizza_cli_user_logged_in"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;During command execution, the various “capture” functions get called to capture error paths, happy paths, etc.&lt;/p&gt;

&lt;p&gt;For the anonymized IDs, we use &lt;a href="https://github.com/google/uuid" rel="noopener noreferrer"&gt;Google’s excellent UUID Go library&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;newUUID&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These UUIDs get stored locally on end users' machines as JSON under their home directory: &lt;code&gt;~/.pizza-cli/telemetry.json&lt;/code&gt;. This gives end users complete authority and autonomy to delete this telemetry data if they want (or to disable telemetry altogether through configuration options!), ensuring they stay anonymous when using the CLI.&lt;/p&gt;

&lt;h2&gt;Iterative Development and Testing&lt;/h2&gt;

&lt;p&gt;Our lean engineering team follows an iterative development process, focusing on delivering small, testable features rapidly. Typically, we do this through GitHub issues, pull requests, milestones, and projects. We use Go's built-in testing framework extensively, writing unit tests for individual functions and integration tests for entire commands.&lt;/p&gt;

&lt;p&gt;Unfortunately, Go’s standard testing library doesn’t have great assertion functionality out of the box. It’s easy enough to use “==” or other operators, but most of the time, when going back and reading through tests, it’s nice to be able to eyeball what’s going on with assertions like “assert.Equal” or “assert.Nil”.&lt;/p&gt;

&lt;p&gt;We’ve integrated the excellent &lt;a href="https://github.com/stretchr/testify" rel="noopener noreferrer"&gt;testify library&lt;/a&gt; with its “assert” functionality to allow for smoother test implementation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;LoadConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nonExistentPath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;require&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Nil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Using Just&lt;/h2&gt;

&lt;p&gt;We heavily use &lt;a href="https://github.com/casey/just" rel="noopener noreferrer"&gt;Just&lt;/a&gt; at OpenSauced, a command-runner utility much like GNU Make, for easily executing small scripts. This has enabled us to quickly onboard new team members and community members to our Go ecosystem, since building and testing is as simple as “just build” or “just test”!&lt;/p&gt;

&lt;p&gt;For example, to create a simple build utility in Just, within a justfile, we can have:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;build:
  go build main.go -o build/pizza
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This builds a Go binary into the build/ directory. Now, building locally is as simple as running “just build”.&lt;/p&gt;

&lt;p&gt;But we’ve been able to integrate more functionality into using Just and have made it a cornerstone of how our entire build, test, and development framework is executed. For example, to build a binary for the local architecture with injected build time variables (like the sha the binary was built against, the version, the date time, etc.), we can use the local environment and run extra steps in the script before executing the “go build”:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;build:
    #!/usr/bin/env sh
  echo "Building for local arch"

  export VERSION="${RELEASE_TAG_VERSION:-dev}"
  export DATETIME=$(date -u +"%Y-%m-%d-%H:%M:%S")
  export SHA=$(git rev-parse HEAD)

  go build \
    -ldflags="-s -w \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.Version=${VERSION}' \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.Sha=${SHA}' \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.Datetime=${DATETIME}' \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.writeOnlyPublicPosthogKey=${POSTHOG_PUBLIC_API_KEY}'" \
    -o build/pizza
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ve even extended this to enable cross-architecture and cross-OS builds: Go uses the GOARCH and GOOS environment variables to know which CPU architecture and operating system to build for. To build other variants, we can create specific Just commands for that:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Builds for Darwin linux (i.e., MacOS) on arm64 architecture (i.e. Apple silicon)
build-darwin-arm64:
  #!/usr/bin/env sh

  echo "Building darwin arm64"

  export VERSION="${RELEASE_TAG_VERSION:-dev}"
  export DATETIME=$(date -u +"%Y-%m-%d-%H:%M:%S")
  export SHA=$(git rev-parse HEAD)
  export CGO_ENABLED=0
  export GOOS="darwin"
  export GOARCH="arm64"

  go build \
    -ldflags="-s -w \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.Version=${VERSION}' \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.Sha=${SHA}' \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.Datetime=${DATETIME}' \
    -X 'github.com/open-sauced/pizza-cli/pkg/utils.writeOnlyPublicPosthogKey=${POSTHOG_PUBLIC_API_KEY}'" \
    -o build/pizza-${GOOS}-${GOARCH}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Building the Pizza CLI using Go and Cobra has been an exciting journey and we’re thrilled to share it with you. The combination of Go's performance and simplicity with Cobra's powerful command structuring has allowed us to create a tool that's not only robust and powerful, but also user-friendly and maintainable.&lt;/p&gt;

&lt;p&gt;We invite you to explore the &lt;a href="https://github.com/open-sauced/pizza-cli" rel="noopener noreferrer"&gt;Pizza CLI GitHub repository&lt;/a&gt;, try out the tool, and &lt;a href="https://github.com/orgs/open-sauced/discussions/categories/general-feedback-or-bugs" rel="noopener noreferrer"&gt;let us know your thoughts&lt;/a&gt;. Your feedback and contributions are invaluable as we work to make code ownership management easier for development teams everywhere!&lt;/p&gt;

</description>
      <category>go</category>
    </item>
    <item>
      <title>Introducing the Pizza CLI</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Mon, 16 Sep 2024 15:13:37 +0000</pubDate>
      <link>https://dev.to/opensauced/introducing-the-pizza-cli-1g6f</link>
      <guid>https://dev.to/opensauced/introducing-the-pizza-cli-1g6f</guid>
      <description>&lt;p&gt;As software engineering teams and projects scale, a common problem larger organizations can find themselves in is deciphering the “who’s who” of a codebase. This problem only compounds itself if large mono-repos are in use or multiple teams interact in the same space. Developers may find themselves asking “Who do I ask for a review on this? What team owns this part of the code base? Who do I ask for help?”&lt;/p&gt;

&lt;p&gt;On the receiving end of this, you may find yourself getting asked questions about things you haven’t been involved in for years. Or you may be missing critical notifications on pieces of code you or your team maintain. Or even worse, there may be cross-team miscommunications that cause problems for the code you own.&lt;/p&gt;

&lt;p&gt;As teams grow and codebases expand, before too long, it can be very easy to lose the thread on who owns what piece of “knowledge” across your engineering org. This lost context can slow down development, hinder open collaboration, and slowly erode an organization’s engineering culture.&lt;/p&gt;

&lt;p&gt;But what if there was a way to automate code ownership, streamline collaboration, and reduce this knowledge debt? Today, the OpenSauced team is very excited to introduce the Pizza CLI, a powerful command-line tool designed to help maintainers, teams, and organizations manage their engineering “ownership” culture and derive insights right on the command line.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing the Pizza CLI
&lt;/h2&gt;

&lt;p&gt;The Pizza CLI is our solution to the challenges of lost context and asking “Who’s who?” Born from discussions with industry experts like &lt;a href="https://x.com/bdougieYO/" rel="noopener noreferrer"&gt;Bdougie&lt;/a&gt; and &lt;a href="https://twitter.com/kelseyhightower" rel="noopener noreferrer"&gt;Kelsey Hightower&lt;/a&gt;, and inspired by robust tools used by hyperscale tech companies across the industry, the Pizza CLI empowers teams to automate code ownership and enhance cross-org collaboration directly from the command line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;CODEOWNERS Generation:&lt;/em&gt;&lt;/strong&gt; Easily generate GitHub-style CODEOWNERS or Google-style OWNERS files, granularly mapping out who owns which parts of your codebase based on git history, number of lines touched, and current activity. These owner files can then be used in &lt;a href="https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners" rel="noopener noreferrer"&gt;GitHub CODEOWNERS automation&lt;/a&gt; or as part of more robust CI/CD.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Attribution Configuration:&lt;/em&gt;&lt;/strong&gt; Use a simple YAML configuration to map commit emails to GitHub usernames and teams, ensuring accurate ownership assignments and easy management of entities within your engineering org.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;OpenSauced Integration:&lt;/em&gt;&lt;/strong&gt; Seamlessly connect with &lt;a href="https://opensauced.pizza" rel="noopener noreferrer"&gt;OpenSauced&lt;/a&gt; to create Contributor Insights pages and metrics, helping you visualize and understand your project's contributor and owner landscape.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enhanced collaboration for large teams
&lt;/h2&gt;

&lt;p&gt;By clearly defining code ownership in a granular manner and integrating seamlessly with GitHub CODEOWNERS functionality, the Pizza CLI helps you:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Improve Efficiency:&lt;/em&gt;&lt;/strong&gt; Developers know exactly who to reach out to for code reviews or questions, reducing delays, miscommunications, and team misalignments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Enhance Collaboration:&lt;/em&gt;&lt;/strong&gt; Ownership transparency creates a culture of shared responsibility and teamwork, further enhancing open “inner-source” between teams and developers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Streamline Onboarding:&lt;/em&gt;&lt;/strong&gt; New team members can quickly identify code owners, making it easier for them to ramp up and contribute confidently. Oftentimes, this can be automated through GitHub’s integration with CODEOWNER files through automatic PR review requests and notifications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started with Pizza CLI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Installation
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

brew &lt;span class="nb"&gt;install &lt;/span&gt;open-sauced/tap/pizza


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We offer a number of flexible options for installing the Pizza CLI onto your system (including Homebrew, NPM, Docker, and more). Check out the docs in the repository for a full rundown of the ways you can install this tool.&lt;/p&gt;

&lt;h3&gt;
  
  
  Validate your install
&lt;/h3&gt;

&lt;p&gt;Check to make sure you can run the Pizza CLI:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

pizza version


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This will print out the version you have installed. If the command runs in your terminal, you have successfully installed the CLI!&lt;/p&gt;

&lt;h3&gt;
  
  
  Generate a config
&lt;/h3&gt;

&lt;p&gt;Before you can start generating codeowner files, you’ll need a YAML configuration that maps git commit emails to GitHub user logins. You can generate a config with the “pizza generate config” command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

pizza generate config /path/to/your/git/repo -i


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The “-i” flag tells the Pizza CLI to use “interactive” mode. This iterates through the git log, looking at commits and who authored them. It will then ask you to attribute those commit emails to individuals on your team:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7evvc4prifgseqzirpkj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7evvc4prifgseqzirpkj.png" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once you’ve finished generating your config, you’ll see that a &lt;code&gt;.sauced.yaml&lt;/code&gt; file now exists in the git repo you originally pointed the pizza command at. It’s been populated with the associated logins and emails that can be used to attribute changes in the repository to individual owners.&lt;/p&gt;

&lt;p&gt;We encourage you to commit this file to your repository as a core piece of configuration “infrastructure” that records which attributions are associated with which individuals. Alternatively, if exposing commit emails in the config is not acceptable, you can add it to a private secret store and pull it down manually to make code attributions for individuals on your teams.&lt;/p&gt;

&lt;p&gt;You can also attribute GitHub teams to long lists of emails in your YAML configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;

&lt;span class="na"&gt;attribution&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Keys may also be GitHub team names.&lt;/span&gt;
  &lt;span class="na"&gt;open-sauced/engineering&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;john@opensauced.pizza&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;other-user@email.com&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;other-user@no-reply.github.com&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This way, multiple people can be associated with a single team within your configuration, and the team receives the same attribution as any individual who is a code owner for those files. In other words, this is a powerful way to compose your teams and manage ownership across a whole group of engineers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Generate a CODEOWNERS file
&lt;/h3&gt;

&lt;p&gt;Now that you have a config, you can generate a CODEOWNERS file:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;

pizza generate codeowners /path/to/your/git/repo


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This will read the &lt;code&gt;.sauced.yaml&lt;/code&gt; configuration file that you generated in your repo to know which git commit emails are associated with GitHub logins or teams.&lt;/p&gt;

&lt;p&gt;The codeowners generation iterates the git log and looks at the number of lines touched and the frequency of updates from individuals. It will find the top 3 codeowners per file who’ve done the most work within the configured time range (note: you can use the &lt;code&gt;--range&lt;/code&gt; flag to change how far back to look in the git log!)&lt;/p&gt;

&lt;p&gt;We’re very excited to be bringing this tool to you! Knowing who owns what and how to get help from other teams can improve your workflow and minimize bottlenecks. Using tools that help you do that can make it easier to connect with people when it matters most.&lt;/p&gt;

&lt;p&gt;Be sure to &lt;a href="https://github.com/open-sauced/pizza-cli" rel="noopener noreferrer"&gt;check out the open source Pizza CLI repository&lt;/a&gt; for a full rundown of everything that’s possible with the Pizza CLI. And feel free to ask any questions or give us feedback in the GitHub issues!&lt;/p&gt;

&lt;p&gt;As always, stay saucy!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>codeowners</category>
      <category>cli</category>
    </item>
    <item>
      <title>How we use Kubernetes jobs to scale OpenSSF Scorecard</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Thu, 08 Aug 2024 15:13:26 +0000</pubDate>
      <link>https://dev.to/opensauced/how-we-use-kubernetes-jobs-to-scale-openssf-scorecard-5bf2</link>
      <guid>https://dev.to/opensauced/how-we-use-kubernetes-jobs-to-scale-openssf-scorecard-5bf2</guid>
      <description>&lt;p&gt;&lt;a href="https://opensauced.pizza/blog/introducing-openssf-scorecard-for-opensauced" rel="noopener noreferrer"&gt;We recently released integrations with the OpenSSF Scorecard&lt;/a&gt; on the OpenSauced platform. The OpenSSF Scorecard is a powerful Go command line interface that anyone can use to begin understanding the security posture of their projects and dependencies. It runs several checks for dangerous workflows, CICD best practices, if the project is still maintained, and much more. This enables software builders and consumers to understand their overall security picture, deduce if a project is safe to use, and where improvements to security practices need to be made.&lt;/p&gt;

&lt;p&gt;But one of our goals with integrating the OpenSSF Scorecard into the OpenSauced platform was to make this available to the broader open source ecosystem at large. If it’s a repository on GitHub, we wanted to be able to display a score for it. This meant scaling the Scorecard CLI to target nearly any repository on GitHub. Much easier said than done!&lt;/p&gt;

&lt;p&gt;In this blog post, let’s dive into how we did that using Kubernetes and what technical decisions we made with implementing this integration.&lt;/p&gt;

&lt;p&gt;We knew that we would need to build a cron-type microservice that would frequently update scores across a myriad of repositories; the true question was how we would do that. It wouldn’t make sense to run the Scorecard CLI ad hoc: the platform could too easily get overwhelmed, and we wanted to be able to do deeper analysis on scores across the open source ecosystem, even if the OpenSauced repo page hasn’t been visited recently. Initially, we looked at using the Scorecard Go library as direct dependent code and running scorecard checks within a single, monolithic microservice. We also considered using serverless jobs to run one-off scorecard containers that would give back the results for individual repositories.&lt;/p&gt;

&lt;p&gt;The approach we ended up landing on, which marries simplicity, flexibility, and power, is to use Kubernetes Jobs at scale, all managed by a “scheduler” Kubernetes controller microservice. Instead of building a deeper code integration with Scorecard, running one-off Kubernetes Jobs gives us the same benefits as a serverless approach, but with reduced cost, since we’re managing it all directly on our Kubernetes cluster. Jobs also offer a lot of flexibility in how they run: they can have long, extended timeouts, they can use disk, and, like any other Kubernetes paradigm, they can have multiple pods doing different tasks.&lt;/p&gt;

&lt;p&gt;Let’s break down the individual components of this system and see how they work in depth:&lt;/p&gt;

&lt;p&gt;The first and biggest part of this system is the “scorecard-k8s-scheduler”; a Kubernetes controller-like microservice that kicks off new jobs on-cluster. While this microservice follows many of the principles, patterns, and methods used when &lt;a href="https://kubernetes.io/docs/concepts/architecture/controller/" rel="noopener noreferrer"&gt;building a traditional Kubernetes controller or operator&lt;/a&gt;, it does not watch for or mutate custom resources on the cluster. Its function is to simply kick off Kubernetes Jobs that run the Scorecard CLI and gather finished job results.&lt;/p&gt;

&lt;p&gt;Let’s look first at the main control loop in the Go code. This microservice uses the Kubernetes Client-Go library to interface directly with the cluster the microservice is running on: this is often referred to as an on-cluster config and client. Within the code, after bootstrapping the on-cluster client, we poll for repositories in our database that need updating. Once some repos are found, we kick off Kubernetes jobs on individual worker “threads” that will wait for each job to finish.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// buffered channel, sort of like semaphores, for threaded working&lt;/span&gt;
&lt;span class="n"&gt;sem&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;numConcurrentJobs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// continuous control loop&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// blocks on getting semaphore off buffered channel&lt;/span&gt;
    &lt;span class="n"&gt;sem&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;

    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// release the hold on the channel for this Go routine when done&lt;/span&gt;
        &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;sem&lt;/span&gt;
        &lt;span class="p"&gt;}()&lt;/span&gt;

        &lt;span class="c"&gt;// grab repo needing update, start scorecard Kubernetes Job on-cluster,&lt;/span&gt;
        &lt;span class="c"&gt;// wait for results, etc. etc.&lt;/span&gt;

        &lt;span class="c"&gt;// sleep the configured amount of time to relieve backpressure&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This “infinite control loop” method, with a buffered channel, is a common Go pattern for doing continuous work with a fixed number of threads. The number of concurrent Go funcs running at any given time is capped by the configured “numConcurrentJobs” value, which sets the buffered channel’s capacity so that it acts as a worker pool. Since the buffered channel is a shared resource that all threads can use and inspect, I often like to think of it as a semaphore: a resource, much like a mutex, that multiple threads can attempt to lock on and access.&lt;/p&gt;

&lt;p&gt;In our production environment, we’ve scaled up the number of threads this scheduler runs at once. Since the scheduler itself isn’t very computationally heavy and will just kick off jobs and wait for results to eventually surface, we can push the envelope of what it can manage. We also have a built-in backoff system that attempts to relieve pressure when needed: it increments the configured “backoff” value if there are errors or if there are no repos found to calculate a score for. This ensures we’re not continuously slamming our database with queries, and the scorecard scheduler itself can remain in a “waiting” state, not taking up precious compute resources on the cluster.&lt;/p&gt;

&lt;p&gt;Within the control loop, we do a few things: first, we query our database for repositories needing their scorecard updated. This is a simple database query that is based on some timestamp metadata we watch for and have indexes on. Once a configured amount of time passes since the last score was calculated for a repo, it will bubble up to be crunched by a Kubernetes Job running the Scorecard CLI.&lt;/p&gt;

&lt;p&gt;Next, once we have a repo to get the score for, we kick off a Kubernetes Job using the “gcr.io/openssf/scorecard” image. Bootstrapping this job in Go code using Client-Go looks very similar to how it would look with yaml, just using the various libraries and apis available via “k8s.io” imports and doing it programmatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// defines the Kubernetes Job and its spec&lt;/span&gt;
&lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;batchv1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// structs and details for the actual Job&lt;/span&gt;
    &lt;span class="c"&gt;// including metav1.ObjectMeta and batchv1.JobSpec&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// create the actual Job on cluster&lt;/span&gt;
&lt;span class="c"&gt;// using the in-cluster config and client&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clientset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BatchV1&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Jobs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ScorecardNamespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metav1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateOptions&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the job is created, we wait for it to signal it has completed or errored. Much like with kubectl, Client-Go offers a helpful way to “watch” resources and observe their state when they change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// watch selector for the job name on cluster&lt;/span&gt;
&lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clientset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;BatchV1&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Jobs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ScorecardNamespace&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Watch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metav1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListOptions&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;FieldSelector&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"metadata.name="&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;jobName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c"&gt;// continuously pop off the watch results channel for job status&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;watch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResultChan&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// wait for job success, error, or other states&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, once we have a successful job completion, we can grab the results from the Job’s pod logs which will have the actual json results from the scorecard CLI! Once we have those results, we can upsert the scores back into the database and mutate any necessary metadata to signal to our other microservices or the OpenSauced API that there’s a new score!&lt;/p&gt;

&lt;p&gt;As mentioned before, the scorecard-k8s-scheduler can have any number of concurrent jobs running at once: in our production setting we have a large number of jobs running at once, all managed by this microservice. The intent is to be able to update scores every 2 weeks across all repositories on GitHub. With this kind of scale, we hope to be able to provide powerful tooling and insights to any open source maintainer or consumer!&lt;/p&gt;

&lt;p&gt;The “scheduler” microservice ends up being a small part of this whole system: anyone familiar with Kubernetes controllers knows that there are additional pieces of Kubernetes infrastructure that are needed to make the system work. In our case, we needed some role-based access control (RBAC) to enable our microservice to create Jobs on the cluster.&lt;/p&gt;

&lt;p&gt;First, we need a service account: this is the account that will be used by the scheduler and have access controls bound to it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-ns&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We place this service account in our “scorecard-ns” namespace where all this runs.&lt;/p&gt;

&lt;p&gt;Next, we need to have a role and role binding for the service account. This includes the actual access controls (including being able to create Jobs, view pod logs, etc.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Role&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-scheduler-role&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-ns&lt;/span&gt;
&lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;batch"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jobs"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;watch"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;patch"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;update"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;apiGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pods"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pods/log"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;verbs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;watch"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

---

&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RoleBinding&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-scheduler-role-binding&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-ns&lt;/span&gt;
&lt;span class="na"&gt;subjects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ServiceAccount&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-sa&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-ns&lt;/span&gt;
&lt;span class="na"&gt;roleRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Role&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scorecard-scheduler-role&lt;/span&gt;
  &lt;span class="na"&gt;apiGroup&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;rbac.authorization.k8s.io&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You might be asking yourself, “Why do I need to give this service account access to get pods and pod logs? Isn’t that an overextension of the access controls?” Remember! Jobs have pods, and in order to get the pod logs that contain the actual results of the Scorecard CLI, we must be able to list a job’s pods and then read their logs!&lt;/p&gt;

&lt;p&gt;The second part of this, the “RoleBinding”, is where we actually attach the Role to the service account. This service account can then be used when kicking off new jobs on the cluster.&lt;/p&gt;

&lt;p&gt;—&lt;/p&gt;

&lt;p&gt;Huge shout out to &lt;a href="https://github.com/alexellis" rel="noopener noreferrer"&gt;Alex Ellis&lt;/a&gt; and his excellent &lt;a href="https://github.com/alexellis/run-job" rel="noopener noreferrer"&gt;run-job&lt;/a&gt; controller: this was a huge inspiration and reference for correctly using Client-Go with Jobs!&lt;/p&gt;

&lt;p&gt;Stay saucy everyone!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>go</category>
      <category>security</category>
    </item>
    <item>
      <title>Introducing OpenSSF Scorecard for OpenSauced</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Tue, 06 Aug 2024 17:48:11 +0000</pubDate>
      <link>https://dev.to/opensauced/introducing-openssf-scorecard-for-opensauced-1ba7</link>
      <guid>https://dev.to/opensauced/introducing-openssf-scorecard-for-opensauced-1ba7</guid>
      <description>&lt;p&gt;In September of 2022, the European Parliament introduced the &lt;a href="https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act" rel="noopener noreferrer"&gt;“Cyber Resilience Act”&lt;/a&gt;, commonly called the CRA: a new piece of legislation that requires anyone providing digital products in the EU to meet certain security and compliance requirements.&lt;/p&gt;

&lt;p&gt;But there’s a catch: before the CRA, companies providing or distributing software would often take on much of the risk of ensuring safe and reliable software was shipped to end users. Now, software maintainers further down the supply chain will have to carry more of that weight. Not only may some open source maintainers need to meet certain requirements, but they may also have to provide an up-to-date security profile of their project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.linuxfoundation.org/blog/understanding-the-cyber-resilience-act" rel="noopener noreferrer"&gt;As the Linux Foundation puts it&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The Act shifts much of the security burden onto those who develop software, as opposed to the users of software. This can be justified by two assumptions: first, software developers know best how to mitigate vulnerabilities and distribute patches; and second, it’s easier to mitigate vulnerabilities at the source than requiring users to do so.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There’s a lot to unpack in the CRA. And it’s still not clear how individual open source projects, maintainers, foundations, or companies will be directly impacted. But, it’s clear that the broader open source ecosystem needs easier ways to understand the security risk of projects deep within dependency chains. With all that in mind, we are very excited to introduce the OpenSSF Scorecard ratings within the OpenSauced platform. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is the OpenSSF Scorecard?
&lt;/h2&gt;

&lt;p&gt;The OpenSSF is &lt;a href="https://openssf.org/" rel="noopener noreferrer"&gt;the Open Source Security Foundation&lt;/a&gt;: a multidisciplinary group of software developers, industry leaders, security professionals, researchers, and government liaisons. The OpenSSF aims to enable the broader open source ecosystem “to secure open source software for the greater public good.” They interface with critical personnel across the software industry to fight for a safer technological future.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/ossf/scorecard" rel="noopener noreferrer"&gt;The OpenSSF Scorecard project&lt;/a&gt; is an effort to unify what best practices open source maintainers and consumers should use to judge if their code, practices, and dependencies are safe. Ultimately, the “scorecard” command line interface gives any the capability to inspect repositories, run “checks” against those repos, and derive an overall score for the risk profile of that project. It’s a very powerful software tool that gives you a general picture of where a piece of software is considered risky. It can also be a great starting point for any open source maintainer to develop better practices and find out where they may need to make improvements. By providing a standardized approach to assessing open source security and compliance, the Scorecard helps organizations more easily identify supply chain risks and regulatory requirements.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenSauced OpenSSF Scorecards
&lt;/h2&gt;

&lt;p&gt;Using the scorecard command line interface as a cornerstone, we’ve built infrastructure and tooling to enable OpenSauced to capture scores for nearly all repositories on GitHub. Anything over a 6 or a 7 is generally considered safe to use with no glaring issues. Scores of 9 or 10 are doing phenomenally well. And projects with lower scores should be inspected closely to understand what’s gone wrong.&lt;/p&gt;
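&lt;p&gt;As a rough illustration of those bands (the scorecard CLI emits a 0-10 score; the exact cutoffs in this sketch are an assumption, since the post only gives approximate ranges), interpreting a score programmatically might look like:&lt;/p&gt;

```python
# Rough interpretation of the score bands above. The scorecard CLI emits a
# 0-10 score; the exact cutoffs here are this sketch's assumption.

def interpret_score(score: float) -> str:
    if score >= 9:
        return "doing phenomenally well"
    if score >= 7:
        return "generally safe to use"
    return "inspect closely"

interpret_score(9.6)  # "doing phenomenally well"
```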

&lt;p&gt;Scorecards are enabled across all repositories. With this integration, we aim to make it easier for software maintainers to understand the security posture of their project and for software consumers to be assured that their dependencies are safe to use.&lt;/p&gt;

&lt;p&gt;Starting today, you can see the score for any project within individual &lt;a href="https://opensauced.pizza/docs/features/repo-pages/" rel="noopener noreferrer"&gt;Repository Pages&lt;/a&gt;. For example, in &lt;a href="https://app.opensauced.pizza/s/%20kubernetes/kubernetes" rel="noopener noreferrer"&gt;kubernetes/kubernetes&lt;/a&gt;, we can see the project is safe for use:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgist.github.com%2Fuser-attachments%2Fassets%2F268078b3-4315-4354-a98b-835660f30023" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgist.github.com%2Fuser-attachments%2Fassets%2F268078b3-4315-4354-a98b-835660f30023" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s look at another example: &lt;a href="https://app.opensauced.pizza/s/crossplane/crossplane" rel="noopener noreferrer"&gt;crossplane/crossplane&lt;/a&gt;. These maintainers are doing an awesome job of ensuring they are following best practices for open source security and compliance!!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgist.github.com%2Fuser-attachments%2Fassets%2Fc9c6c47c-ab2e-40e4-b884-4bd330f2c72e" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgist.github.com%2Fuser-attachments%2Fassets%2Fc9c6c47c-ab2e-40e4-b884-4bd330f2c72e" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The checks that the OpenSSF Scorecard runs cover a wide range of common open source security practices, both “in code” and in the maintenance of the project: e.g. code review best practices, whether “dangerous workflows” are present (like untrusted code being checked out and run during CI/CD runs), whether the project is actively maintained, the use of signed releases, and many more.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of OpenSSF Scorecards at OpenSauced
&lt;/h2&gt;

&lt;p&gt;We plan to bring the OpenSSF Scorecard to more of the OpenSauced platform, as we aim to be the definitive place for open source security and compliance for maintainers and consumers. As part of that, we’ll be bringing more details to the OpenSSF Scorecard with how individual checks are ranked:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.sanity.io%2Fimages%2Fr7m53vrk%2Fproduction%2Fc2362e76a4d6f9a2beb5f3628ad38e381ed70f2d-808x790.png%3Fw%3D450" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.sanity.io%2Fimages%2Fr7m53vrk%2Fproduction%2Fc2362e76a4d6f9a2beb5f3628ad38e381ed70f2d-808x790.png%3Fw%3D450" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We’ll also be bringing OpenSSF Scorecard to our premium offering, &lt;a href="https://opensauced.pizza/docs/features/workspaces/" rel="noopener noreferrer"&gt;Workspaces&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgist.github.com%2Fuser-attachments%2Fassets%2Fc3f16287-ada5-4354-ac3f-532f1a7dbf1e" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fgist.github.com%2Fuser-attachments%2Fassets%2Fc3f16287-ada5-4354-ac3f-532f1a7dbf1e" alt="image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Within a Workspace, you’ll soon be able to see how each of the projects you are tracking stacks up against the others on open source security and compliance. You can use the OpenSSF score together with all the Workspace insights and metrics, all in one dashboard, to get a good idea of what’s happening within a set of repositories and what their security posture is. In this example, I’m tracking all the repositories within the bottlerocket-os org on GitHub, a security-focused, Linux-based operating system: I can see that each of the repositories has a good rating, which gives me greater confidence in the maintenance status and security posture of this ecosystem. This also enables stakeholders and maintainers of Bottlerocket to have a bird’s-eye snapshot of the compliance and maintenance status of the entire org.&lt;/p&gt;

&lt;p&gt;As the CRA and similar regulations push more of the security burden onto developers, tools like the OpenSSF Scorecard become invaluable. They offer a standardized, accessible way to assess and improve the security of open source projects, helping maintainers meet new compliance requirements and giving software consumers confidence in their choices. &lt;/p&gt;

&lt;p&gt;Looking ahead, we're committed to expanding these capabilities at OpenSauced. By providing comprehensive security insights, from individual repository scores to organization-wide overviews in Workspaces, we're working to create a more secure and transparent open source ecosystem: one where anyone in the open source community can better understand their software dependencies and feel empowered to make meaningful changes where needed, and where open source maintainers have helpful tools to better maintain their projects.&lt;/p&gt;

&lt;p&gt;Stay saucy!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>security</category>
    </item>
    <item>
      <title>Understanding the Lottery Factor</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Wed, 22 May 2024 22:08:11 +0000</pubDate>
      <link>https://dev.to/opensauced/understanding-the-lottery-factor-41gc</link>
      <guid>https://dev.to/opensauced/understanding-the-lottery-factor-41gc</guid>
      <description>&lt;p&gt;It’s 2:36am on a Sunday morning. You’re on-call and your pager is going off with a critical alert. You flip a light on, roll out of bed, and groggily open your laptop. Maybe it’s nothing and you can go back to bed, addressing whatever it is in the morning. You log on, silence the alert, and start digging into whatever’s going on. Something’s obviously not right: clients don’t seem to be connecting to your databases correctly. Or there’s some problem with the schema, but that wouldn’t make sense since no one should have pushed changes this late at night on a weekend. You start sifting through logs. You feel your pulse pick up as you notice strange logs from the databases. Really strange logs. Connection logs from IP addresses that you don’t recognize and aren’t within your VPC. Clients still aren’t able to connect so you decide to use the “break-glass” service account to investigate what’s going on inside one of your production databases and debug further. Maybe there’s a weird configuration that needs updating or something needs to be hard-reset to start working again.&lt;/p&gt;

&lt;p&gt;What you see startles you: every single row of your production database has garbled up messes of data, not the textual data you were expecting. Digging further in, you find a recent change to the schema and pushes from the database root account. One change in particular catches your attention: a new table called “ransom_note”. You pause, shocked, waiting to see if you’ll suddenly wake up from a bad dream. You cautiously begin to inspect the new table: &lt;em&gt;“SELECT COUNT(*) FROM ransom_note”&lt;/em&gt; returns only 1 row. &lt;em&gt;“SELECT * FROM ransom_note”&lt;/em&gt; reveals your worst suspicions: “&lt;em&gt;all your data has been encrypted, pay us 10 BTC to have the decryption key&lt;/em&gt;”.&lt;/p&gt;




&lt;p&gt;This is a nightmare scenario for almost every technology business owner, Chief Information Security Officer, and security red-team: a sudden and unexpected attack orchestrated through some unknown means that completely cripples your operations. Maybe it was a well orchestrated social engineering attack. Maybe it was an extremely unfortunate misconfiguration that let some bad actors into your networks. Or maybe it was a sophisticated supply-chain attack from one of the many hundreds of open source dependencies you have within your product’s stack.&lt;/p&gt;

&lt;p&gt;Supply-chain attacks have become very popular among nefarious actors for a few reasons: open source software is used nearly everywhere and many open source maintainers are spread incredibly thin. Open source software has become the critical infrastructure of the commons that we all depend on today. But it’s not uncommon to find solo-maintained or completely abandoned projects that have millions of downloads and sit in the critical dependency path within the software supply chain of many large enterprise products.&lt;/p&gt;

&lt;p&gt;A good example of this is the recent &lt;a href="https://en.wikipedia.org/wiki/XZ_Utils_backdoor"&gt;xz supply-chain attack against ssh&lt;/a&gt;: a malicious actor was able to inject a backdoor into ssh, a secure way to connect to other computers through a network, by adding nefarious code to the xz library, a lossless data compression library. In theory, if this had not been detected as early as it was, it would have given the attackers a way to remotely execute code on, or gain access to, any affected Linux computer. One thing that stands out in this example, like so many other supply-chain attacks, is the maintenance status of xz: it went relatively untouched with only a few people around to maintain it. With the maintainer burned out, no other volunteers, and very few resources dedicated to the project, the attacker was easily able to slip in malicious code. A new co-maintainer automatically “inherits trust built up by the original maintainer”, and the attacker used that goodwill to make nefarious changes.&lt;/p&gt;

&lt;p&gt;For further reading and analysis on the tragedy of the xz attack, I highly recommend &lt;a href="https://robmensching.com/blog/posts/2024/03/30/a-microcosm-of-the-interactions-in-open-source-projects/"&gt;this piece from Rob Mensching&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;While there’s no one catch-all solution for preventing these kinds of problems in open source, one piece of the bigger puzzle is the “Lottery Factor”: a metric that looks at open source communities and the weight and distribution of work being done by individuals within a project.&lt;/p&gt;

&lt;p&gt;The way we at OpenSauced are defining the Lottery Factor is as follows:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The minimum number of team members that have to suddenly disappear from a project (they won the lottery!) before the project stalls due to lack of knowledgeable or competent personnel. If 1 contributor makes over 50% of commits: Very high risk. 2 contributors make over 50% of commits: High risk. 3 to 5 contributors make over 50% of commits: Moderate risk. And over 5 contributors make over 50% of commits: Low risk.&lt;/p&gt;
&lt;/blockquote&gt;
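&lt;p&gt;That definition translates directly into code. Here’s a minimal sketch, assuming a simple mapping of author to commit count (a real analysis would pull these counts from the project’s contribution history):&lt;/p&gt;

```python
# Sketch of the Lottery Factor rating defined above: count how many of the
# top contributors it takes to exceed 50% of commits, then map that count
# to a risk level. (The commit counts below are illustrative.)

def lottery_factor(commits_by_author: dict[str, int]) -> str:
    total = sum(commits_by_author.values())
    running, needed = 0, 0
    for count in sorted(commits_by_author.values(), reverse=True):
        running += count
        needed += 1
        if running * 2 > total:  # this group now holds over 50% of commits
            break
    if needed == 1:
        return "Very High"
    if needed == 2:
        return "High"
    if needed >= 6:
        return "Low"
    return "Moderate"

# One contributor with the majority of commits is the riskiest case:
lottery_factor({"solo-maintainer": 120, "driveby": 6})  # "Very High"
```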

&lt;p&gt;The Lottery Factor can help uncover this sort of burnout and identify projects that need an injection of critical engineering resources. This can begin to give you an idea of how catastrophic it would be if someone who makes the majority of contributions in a project suddenly disappeared (because they won the lottery and went off to live their best life on the beach!). This may happen for any number of reasons and it’s important to note that the Lottery Factor is unique to each individual project: it’s not a hard and fast rule, but rather, another important metric in understanding the full story of a project.&lt;/p&gt;

&lt;p&gt;With all that in mind, we are very excited to unveil the inclusion of the Lottery Factor in &lt;a href="https://docs.opensauced.pizza/features/repo-pages/"&gt;OpenSauced Repo Pages&lt;/a&gt; as an additional metric and insight you can inspect!!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjxx5p6oum3x3irfz7kk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjxx5p6oum3x3irfz7kk.png" alt="analog repo page" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Through the lens of the Lottery Factor, we can begin to look at projects with a better understanding of where the critical “human” links in the secure software supply chain are, where funding resources need to be spent, and where to allocate crucial engineering resources.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://app.opensauced.pizza/s/analogjs/analog"&gt;analogjs/analog&lt;/a&gt; example above, we can see that in the last 30 days, about 50% of contributions were made by ~2 contributors, 50% of that being &lt;a href="https://app.opensauced.pizza/user/brandonroberts"&gt;Brandon&lt;/a&gt;. This gives the overall Lottery factor as “High” and would start to unveil critical personnel in the Analog and Angular ecosystem.&lt;/p&gt;

&lt;p&gt;An example of a project where the lottery factor is critically high is &lt;a href="https://app.opensauced.pizza/s/zloirock/core-js?range=90"&gt;core-js&lt;/a&gt;, a widely used JavaScript standards library in use by Amazon, Netflix, and many other Fortune 500 companies across the web:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefweezn6szdh3igw0eop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefweezn6szdh3igw0eop.png" alt="core-js repo page" width="800" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Over the last 90 days, the core maintainer “&lt;a href="https://beta.app.opensauced.pizza/user/zloirock"&gt;zloirock&lt;/a&gt;” has made the majority of the contributions. And, because of the wide adoption of core-js, this library could be a good candidate for an injection of critical resources to ensure the good standing and governance of the library.&lt;/p&gt;

&lt;p&gt;Now, let’s look at a project with a “Low” Lottery Factor over the last year where there are no single individuals with the majority of the commits, &lt;a href="https://app.opensauced.pizza/s/kubernetes/kubernetes"&gt;kubernetes/kubernetes&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw73tskfmnzjatruevk3d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw73tskfmnzjatruevk3d.png" alt="kubernetes repo page" width="800" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Because there are so many different people from so many different companies invested in the success of the Kubernetes platform and the cloud-native ecosystem, it makes sense that there are no single critical individuals that would be the sole point of failure if they were no longer working on the project.&lt;/p&gt;

&lt;p&gt;The Lottery Factor can help tell a story unique to each individual community and project. And it can help open source project offices, small teams, or individual contributors better understand the landscape of any open source project or piece of technology they depend on.&lt;/p&gt;

&lt;p&gt;We at OpenSauced hope this can start to help you understand where the critical human factor is within projects you contribute to and depend on! Make sure to &lt;a href="https://docs.opensauced.pizza/features/repo-pages/"&gt;check out OpenSauced Repo Pages&lt;/a&gt; and stay saucy, everyone!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>github</category>
    </item>
    <item>
      <title>How We Saved 10s of Thousands of Dollars Deploying Low Cost Open Source AI Technologies At Scale with Kubernetes</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Tue, 14 May 2024 05:54:00 +0000</pubDate>
      <link>https://dev.to/opensauced/how-we-saved-10s-of-thousands-of-dollars-deploying-low-cost-open-source-ai-technologies-at-scale-with-kubernetes-57j8</link>
      <guid>https://dev.to/opensauced/how-we-saved-10s-of-thousands-of-dollars-deploying-low-cost-open-source-ai-technologies-at-scale-with-kubernetes-57j8</guid>
      <description>&lt;p&gt;When you first start building AI applications with generative AI, you'll likely end up using OpenAI's API at some point in your project's journey. And for good reason! Their API is well-structured, fast, and supported by great libraries. At a small scale or when you’re just getting started, using OpenAI can be relatively economical. There’s also a huge amount of really great educational material out there that walks you through the process of building AI applications and understanding complex techniques using OpenAI’s API.&lt;/p&gt;

&lt;p&gt;One of my personal favorite OpenAI resources these days is the &lt;a href="https://cookbook.openai.com/"&gt;OpenAI Cookbook&lt;/a&gt;: this is an excellent way to start learning how their different models work, how to start taking advantage of the many cutting edge techniques in the AI space, and how to start integrating your data with AI workloads.&lt;/p&gt;

&lt;p&gt;However, as soon as you need to scale up your generative AI operations, you'll quickly encounter a pretty significant obstacle: the cost. Once you start generating thousands (and eventually tens of thousands) of texts via GPT-4, or even the lower-cost GPT-3.5 models, you'll quickly find your OpenAI bill is also growing into the thousands of dollars every month.&lt;/p&gt;

&lt;p&gt;Thankfully, for small and agile teams, there are a lot of great options out there for deploying low cost open source technologies to reproduce an OpenAI compatible API that uses the latest and greatest of the very solid open source models (which in many cases, rival the performance of the GPT 3.5 class of models).&lt;/p&gt;

&lt;p&gt;This is the very situation we at OpenSauced found ourselves in when building the infrastructure &lt;a href="https://oss.fyi/wait-starsearch"&gt;for our new AI offering, StarSearch&lt;/a&gt;: we needed a data pipeline that would continuously get summaries and embeddings of GitHub issues and pull requests in order to do a &lt;em&gt;“needle in the haystack”&lt;/em&gt; cosine similarity search in our vector store as part of a Retrieval Augmented Generation (RAG) flow. RAG is a very popular technique that enables you to provide additional context and search results to a large language model where it wouldn’t have that information in its foundational data otherwise. In this way, an LLM’s answers can be much more accurate for queries that you can "augment" with data you’ve given it context on.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffqu40vr0tcgd0ati84rn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffqu40vr0tcgd0ati84rn.png" alt="Simple RAG" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Cosine similarity search on top of a vector store is a way to enhance this RAG flow even further: because much of our data is unstructured and would be very difficult to parse through using a full text search, we’ve created vector embeddings on AI generated summaries of relevant rows in our database that we want to be able to search on. Vectors are really just lists of numbers, but they represent an “understanding” from an embedding machine learning model that can be compared against a query’s vector embedding to find the “nearest neighbor” data to the end user’s question.&lt;/p&gt;
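&lt;p&gt;For intuition, here’s a minimal sketch of cosine similarity and a naive nearest-neighbor lookup in Python (a real vector store indexes its vectors so the search stays fast at scale):&lt;/p&gt;

```python
import math

# Minimal cosine similarity: 1.0 means the vectors point in the same
# direction (very similar meaning), 0.0 means orthogonal (unrelated).

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# A naive nearest-neighbor lookup is then just "rank stored vectors by
# similarity to the query vector":
def nearest(query: list[float], store: list[dict]) -> dict:
    return max(store, key=lambda item: cosine_similarity(query, item["vec"]))
```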

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp9ycosldmbb8wj3mluh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp9ycosldmbb8wj3mluh.png" alt="Advanced RAG techniques with vector store" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Initially, for the summary generation part of our RAG data pipeline, we were using OpenAI directly and wanted to target "knowing" about the events and communities of the top 40,000+ repositories on GitHub. This way, anyone could ask about and gain unique insights into what's going on across the most prominent projects in the open source ecosystem. But since new issues and pull request events are always flowing through this pipeline, on any given day, upwards of 100,000 new events for the 40,000+ repos would flow through to have summaries generated: that’s a lot of calls to the OpenAI API!!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr664t6fbu88eckqrijya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr664t6fbu88eckqrijya.png" alt="Vector generation data pipeline architecture" width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this kind of scale, we quickly ran into "cost" bottlenecks: we considered further optimizing our usage of OpenAI's APIs to reduce our overall usage, but felt that there was a powerful path forward by using open source technologies at a significantly lower cost to accomplish the same goal at our target scale.&lt;/p&gt;

&lt;p&gt;And while this post won’t get too deep into how we implemented the actual RAG part of StarSearch, we will look at how we bootstrapped the infrastructure to be able to consume many tens of thousands of GitHub events, generate AI summaries from them, and surface those as part of a nearest neighbor search using vLLM and Kubernetes. This was the biggest unlock to getting StarSearch to be able to surface relevant information about various technologies and "know" about what's going on across the open source ecosystem.&lt;/p&gt;

&lt;p&gt;There’s a lot more that could be said about RAG and vector search - I recommend the following resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://learnbybuilding.ai/tutorials/rag-from-scratch"&gt;A beginner's guide to building a Retrieval Augmented Generation (RAG) application from scratch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://stackoverflow.blog/2023/10/09/from-prototype-to-production-vector-databases-in-generative-ai-applications/"&gt;Vector databases in generative AI applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.datastax.com/guides/what-is-cosine-similarity"&gt;What is Cosine Similarity: A Comprehensive Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Running open source inference engines locally
&lt;/h3&gt;

&lt;p&gt;Today, thanks to the power and ingenuity of the open source ecosystem, there are a lot of great options for running AI models and doing "generative inference" on your own hardware.&lt;/p&gt;

&lt;p&gt;A few of the most prominent that come to mind are llama.cpp, vLLM, llamafile, llm, gpt4all, and Hugging Face Transformers. One of my personal favorites is &lt;a href="https://app.opensauced.pizza/s/ollama/ollama"&gt;Ollama&lt;/a&gt;: it allows me to easily run an LLM with &lt;code&gt;ollama run&lt;/code&gt; on the command line of my MacBook. All of these, with their own spin and flavors on the open source AI space, provide a very solid way for you to run open source large language models (like Meta's llama3, Mistral's mixtral model, etc.) locally on your own hardware without the need for a third party API.&lt;/p&gt;

&lt;p&gt;Maybe even more importantly, these pieces of software are well optimized for running models on consumer grade hardware like personal laptops and gaming computers: you don't need a cluster of enterprise grade GPUs or an expensive third party service in order to start playing around with generating text! You can get started today and start building AI applications right from your laptop using open source technology with no 3rd party API.&lt;/p&gt;

&lt;p&gt;This is exactly how I started transitioning our generative AI pipelines from OpenAI to a service we run on top of Kubernetes for &lt;a href="https://app.opensauced.pizza/star-search"&gt;StarSearch&lt;/a&gt;: I started simple with Ollama running a Mistral model locally on my laptop. Then, I began transitioning our OpenAI data pipelines that read from our database and generate summaries to start using my local Ollama server. Ollama, along with many of the other inference engines out there, provide an OpenAI compatible API. Using this, I didn’t have to re-write much of the client code: simply replace the OpenAI API endpoint with the &lt;code&gt;localhost&lt;/code&gt; pointed to Ollama.&lt;/p&gt;
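&lt;p&gt;As a sketch of how small that client-side change is (the endpoint path and payload shape follow the OpenAI chat completions API; port 11434 is Ollama’s default, and the model name here is illustrative):&lt;/p&gt;

```python
import json

# The only client-side change when swapping OpenAI for a local
# OpenAI-compatible server is the base URL; the request shape is identical.
# (Port 11434 is Ollama's default; the model name is illustrative.)

OPENAI_BASE = "https://api.openai.com/v1"
OLLAMA_BASE = "http://localhost:11434/v1"

def chat_request(base_url: str, model: str, prompt: str):
    """Build the URL and JSON body for an OpenAI-style chat completion."""
    url = f"{base_url}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, body

url, body = chat_request(OLLAMA_BASE, "mistral", "Summarize this issue for me.")
```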

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu91g0sstwnrck4m5hwn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu91g0sstwnrck4m5hwn.png" alt="Using Ollama locally" width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing vLLM for production
&lt;/h3&gt;

&lt;p&gt;Eventually, I ran into a real bottleneck using Ollama: it didn't support servicing concurrent clients. And at the kind of scale we're targeting, at any given time we likely need a couple dozen of our data-pipeline microservice runners all batch-processing summaries from the generative AI service concurrently. This way, we could keep up with the constant load from 40,000+ repos on GitHub. Obviously OpenAI's API can handle this kind of load, but how would we replicate it with our own service?&lt;/p&gt;
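&lt;p&gt;The load pattern we needed looks roughly like this stubbed Python sketch: the &lt;code&gt;summarize&lt;/code&gt; coroutine stands in for the real call to the inference server, and the concurrency cap is an arbitrary example value, not our production setting:&lt;/p&gt;

```python
import asyncio

# Stubbed sketch of the load pattern: many summaries generated concurrently,
# with a cap on in-flight requests. The summarize() coroutine stands in for
# the real call to the inference server.

async def summarize(event_id: int) -> str:
    await asyncio.sleep(0)  # placeholder for the actual API round trip
    return f"summary-{event_id}"

async def run_batch(event_ids, max_in_flight: int = 8) -> list[str]:
    sem = asyncio.Semaphore(max_in_flight)

    async def bounded(eid: int) -> str:
        async with sem:
            return await summarize(eid)

    return await asyncio.gather(*(bounded(e) for e in event_ids))

summaries = asyncio.run(run_batch(range(100)))
```

&lt;p&gt;An inference server that batches these concurrent requests efficiently is exactly what we were missing.&lt;/p&gt;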

&lt;p&gt;Eventually, I found &lt;a href="https://app.opensauced.pizza/s/vllm-project/vllm"&gt;vLLM&lt;/a&gt;, a fast inference runner that can service multiple clients behind an OpenAI compatible API and take advantage of multiple GPUs on a given computer with request batching and an efficient use of &lt;em&gt;"PagedAttention"&lt;/em&gt; when doing inference. Also like Ollama, the vLLM community provides a container runtime image which makes it very easy to use on a number of different production platforms. Excellent!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27ckva495gwas41bbv79.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27ckva495gwas41bbv79.png" alt="Using vLLM at scale" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note to the reader:&lt;/em&gt; Ollama very recently merged changes to support concurrent clients. At the time of this writing, that support hadn't yet landed in the main upstream image, but I’m very excited to see how it performs compared to other multi-client inference engines!&lt;/p&gt;
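&lt;p&gt;To make the concurrency requirement concrete, here's a minimal Python sketch of the fan-out pattern our pipeline runners need. The &lt;code&gt;summarize&lt;/code&gt; function and repo names are placeholders for illustration, not our production code:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(repo: str) -> str:
    # Placeholder: in production this would POST to the inference
    # server's OpenAI-compatible /v1/chat/completions endpoint.
    return f"summary of {repo}"

def summarize_batch(repos: list[str], workers: int = 24) -> list[str]:
    # A couple dozen workers issue requests concurrently; the
    # inference server has to batch them to keep up.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, repos))

results = summarize_batch([f"org/repo-{i}" for i in range(100)])
```

&lt;p&gt;This is exactly the access pattern a single-client server falls over on: many in-flight requests that the engine should batch together on the GPU.&lt;/p&gt;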

&lt;h3&gt;
  
  
  Running vLLM locally
&lt;/h3&gt;

&lt;p&gt;To run vLLM locally, you’ll need a Linux system and a Python runtime with the &lt;code&gt;vllm&lt;/code&gt; package installed (&lt;code&gt;pip install vllm&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python &lt;span class="nt"&gt;-m&lt;/span&gt; vllm.entrypoints.openai.api_server &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--model&lt;/span&gt; mistralai/Mistral-7B-Instruct-v0.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This starts the OpenAI-compatible server, which you can then hit locally on port 8000:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:8000/v1/models
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mistralai/Mistral-7B-Instruct-v0.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1715528945&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"owned_by"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vllm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"root"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mistralai/Mistral-7B-Instruct-v0.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"parent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"permission"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modelperm-020c373d027347aab5ffbb73cc20a688"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"model_permission"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1715528945&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"allow_create_engine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"allow_sampling"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"allow_logprobs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"allow_search_indices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"allow_view"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"allow_fine_tuning"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"organization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"group"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"is_blocking"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, to run the OpenAI-compatible API as a container, you can use Docker on your Linux system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--runtime&lt;/span&gt; nvidia &lt;span class="nt"&gt;--gpus&lt;/span&gt; all &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-v&lt;/span&gt; ~/.cache/huggingface:/root/.cache/huggingface &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-p&lt;/span&gt; 8000:8000 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--ipc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;host &lt;span class="se"&gt;\&lt;/span&gt;
    vllm/vllm-openai:latest &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--model&lt;/span&gt; mistralai/Mistral-7B-Instruct-v0.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This mounts the local Hugging Face cache from my Linux machine into the container and shares the host’s IPC namespace (which vLLM needs for PyTorch’s shared memory). Then, using localhost again, we can hit the OpenAI-compatible server running in Docker. Let’s do a chat completion now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl localhost:8000/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
      "model": "mistralai/Mistral-7B-Instruct-v0.2",
      "messages": [
        {"role": "user", "content": "Who won the world series in 2020?"}
      ]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cmpl-9f8b1a17ee814b5db6a58fdfae107977"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chat.completion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1715529007&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mistralai/Mistral-7B-Instruct-v0.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Major League Baseball (MLB) World Series in 2020 was won by the Tampa Bay Rays. They defeated the Los Angeles Dodgers in six games to secure their first-ever World Series title. The series took place from October 20 to October 27, 2020, at Globe Life Field in Arlington, Texas."&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"logprobs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"finish_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"stop_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;21&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"total_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;136&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;115&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
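&lt;p&gt;Since this is a standard OpenAI chat-completion payload, client code can consume it the same way it would consume a response from OpenAI's API. A small sketch of pulling out the message and the token accounting, using the field names from the response above (the content string is abbreviated here):&lt;/p&gt;

```python
import json

# An abbreviated copy of the chat.completion response shown above.
response = json.loads("""
{
  "id": "cmpl-9f8b1a17ee814b5db6a58fdfae107977",
  "object": "chat.completion",
  "model": "mistralai/Mistral-7B-Instruct-v0.2",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "The 2020 World Series..."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 21, "total_tokens": 136, "completion_tokens": 115}
}
""")

# The assistant's reply lives in the first choice's message.
content = response["choices"][0]["message"]["content"]

# Token usage is handy for tracking throughput and cost at scale.
completion_tokens = response["usage"]["completion_tokens"]
```

&lt;p&gt;Tracking &lt;code&gt;usage&lt;/code&gt; per request is a cheap way to watch aggregate token throughput once many runners are hitting the service.&lt;/p&gt;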



&lt;h3&gt;
  
  
  Using Kubernetes for a large scale vLLM service
&lt;/h3&gt;

&lt;p&gt;Running vLLM locally works just fine for testing, developing, and experimenting with inference. But at the kind of scale we're targeting, I knew we'd need an environment that could easily handle any number of compute instances with GPUs, scale up with our needs, and load balance vLLM behind an agnostic service that our data pipeline microservices could hit at a production rate. Enter Kubernetes, a familiar and popular container orchestration system!&lt;/p&gt;

&lt;p&gt;This, in my opinion, is a perfect use case for Kubernetes and would make scaling up an internal AI service that looked like OpenAI's API relatively seamless.&lt;/p&gt;

&lt;p&gt;In the end, the architecture for this kind of deployment looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploy any number of Kubernetes nodes, each with any number of GPUs, into a node pool

&lt;ul&gt;
&lt;li&gt;Install GPU drivers per the managed Kubernetes service provider instructions. We're using Azure AKS so &lt;a href="https://learn.microsoft.com/en-us/azure/aks/gpu-cluster?tabs=add-ubuntu-gpu-node-pool"&gt;they provide these instructions for utilizing GPUs on cluster.&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;Deploy a daemonset for vLLM to run on each node with a GPU&lt;/li&gt;
&lt;li&gt;Deploy a Kubernetes service to load balance internal requests to vLLM's OpenAI compatible API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0zwn4dmxvphjxvmzbdp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0zwn4dmxvphjxvmzbdp.png" alt="kubernetes architecture" width="800" height="577"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Getting the cluster ready
&lt;/h3&gt;

&lt;p&gt;If you're following along at home and looking to reproduce these results, I'm assuming at this point you have a Kubernetes cluster already up and running, likely through a managed Kubernetes provider, and have also installed the necessary GPU drivers onto the nodes that have GPUs.&lt;/p&gt;

&lt;p&gt;Again, on Azure’s AKS, where we deployed this service, we needed to run a daemonset that installs the Nvidia drivers for us on each of the nodes with a GPU:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DaemonSet&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia-device-plugin-daemonset&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gpu-resources&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia-device-plugin-ds&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia-device-plugin-ds&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mcr.microsoft.com/oss/nvidia/k8s-device-plugin:v0.14.1&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia-device-plugin-ctr&lt;/span&gt;
        &lt;span class="na"&gt;securityContext&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;drop&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;All&lt;/span&gt;
        &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/var/lib/kubelet/device-plugins&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;device-plugin&lt;/span&gt;
      &lt;span class="na"&gt;nodeSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;accelerator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia&lt;/span&gt;
      &lt;span class="na"&gt;tolerations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CriticalAddonsOnly&lt;/span&gt;
        &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Exists&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NoSchedule&lt;/span&gt;
        &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia.com/gpu&lt;/span&gt;
        &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Exists&lt;/span&gt;
      &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;hostPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/var/lib/kubelet/device-plugins&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;device-plugin&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This daemonset installs the Nvidia device plugin pod on each node that has the &lt;code&gt;accelerator: nvidia&lt;/code&gt; node selector label and can tolerate a few taints from the system. Again, this is more or less platform-specific, but it gives our AKS cluster the necessary drivers on the nodes that have GPUs, so vLLM can take full advantage of those compute units.&lt;/p&gt;

&lt;p&gt;Eventually, we end up with a cluster node configuration that has the default nodes and the nodes with GPUs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❯ kubectl get nodes -A
NAME                     STATUS   ROLES    AGE   VERSION
defaultpool-88943984-0   Ready    &amp;lt;none&amp;gt;   5d    v1.29.2
defaultpool-88943984-1   Ready    &amp;lt;none&amp;gt;   5d    v1.29.2
gpupool-42074538-0       Ready    &amp;lt;none&amp;gt;   41h   v1.29.2
gpupool-42074538-1       Ready    &amp;lt;none&amp;gt;   41h   v1.29.2
gpupool-42074538-2       Ready    &amp;lt;none&amp;gt;   41h   v1.29.2
gpupool-42074538-3       Ready    &amp;lt;none&amp;gt;   41h   v1.29.2
gpupool-42074538-4       Ready    &amp;lt;none&amp;gt;   41h   v1.29.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each of these GPU nodes gets a device plugin pod, managed by the daemonset, where the drivers get installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❯ kubectl get daemonsets.apps -n gpu-resources
NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR        AGE
nvidia-device-plugin-daemonset   5         5         5       5            5           accelerator=nvidia   41h
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing to note for this setup: each of these GPU nodes has an &lt;code&gt;accelerator: nvidia&lt;/code&gt; label and a taint for &lt;code&gt;nvidia.com/gpu&lt;/code&gt;. These ensure that no other pods are scheduled on these nodes, since we anticipate vLLM consuming all the compute and GPU resources on each of them.&lt;/p&gt;
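&lt;p&gt;Depending on your provider, the label and taint may be applied automatically when the node pool is created. If not, they can be applied by hand; this is a sketch using the node names from the output above (the taint value &lt;code&gt;present&lt;/code&gt; is arbitrary, only the key matters for the toleration):&lt;/p&gt;

```shell
# Label a GPU node so the device plugin and vLLM daemonsets select it
kubectl label nodes gpupool-42074538-0 accelerator=nvidia

# Taint it so ordinary workloads are kept off the GPU hardware
kubectl taint nodes gpupool-42074538-0 nvidia.com/gpu=present:NoSchedule
```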

&lt;h3&gt;
  
  
  Deploying a vLLM DaemonSet
&lt;/h3&gt;

&lt;p&gt;In order to take full advantage of each of the GPUs deployed on the cluster, we can deploy an additional vLLM daemonset that also selects for each of the Nvidia GPU nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DaemonSet&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm-daemonset-ec9831c8&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm-ns&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--model&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mistralai/Mistral-7B-Instruct-v0.2&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--gpu-memory-utilization&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.95"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--enforce-eager&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HUGGING_FACE_HUB_TOKEN&lt;/span&gt;
          &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HUGGINGFACE_TOKEN&lt;/span&gt;
              &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm-huggingface-token&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm/vllm-openai:latest&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8000&lt;/span&gt;
          &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;nvidia.com/gpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
      &lt;span class="na"&gt;nodeSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;accelerator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia&lt;/span&gt;
      &lt;span class="na"&gt;tolerations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NoSchedule&lt;/span&gt;
        &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nvidia.com/gpu&lt;/span&gt;
        &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Exists&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s break down what’s going on here:&lt;/p&gt;

&lt;p&gt;First, we create the metadata and label selectors for the vLLM daemonset pods on the cluster. Then, in the container spec, we provide the arguments to the vLLM container running on the cluster. You’ll notice a few things here: we’re utilizing about 95% of GPU memory in this deployment, and we’re enforcing eager mode, which skips CUDA graph capture to reduce memory consumption at the cost of some inference performance. One of the things I like about vLLM is its many options for tuning and running on different hardware: there are lots of capabilities for tweaking how the inference works or how your hardware is consumed. &lt;a href="https://docs.vllm.ai/en/latest/"&gt;Check out the vLLM docs for further reading!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, you’ll notice we provide a Huggingface token: this is so that vLLM can pull down the model from Huggingface’s API and bypass any “gated” models that we’ve been given permission to access.&lt;/p&gt;

&lt;p&gt;Next, we expose port 8000 for the pod. This will be used later by a service that selects these pods and provides an agnostic, load-balanced endpoint in front of the deployed vLLM pods on port 8000. Then, we request an &lt;code&gt;nvidia.com/gpu&lt;/code&gt; resource, which is provided as a node-level resource by the Nvidia device plugin daemonset (again, depending on your managed Kubernetes provider and how you installed the GPU drivers, this may vary). And finally, we provide the same node selector and taint tolerations to ensure that vLLM runs only on the GPU nodes! Now, when we deploy this, we’ll see the vLLM daemonset has successfully deployed onto each of the GPU nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❯ kubectl get daemonsets.apps -n vllm-ns
NAME                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR        AGE
vllm-daemonset-ec9831c8   5         5         5       5            5           accelerator=nvidia   41h
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Load balancing with an internal Kubernetes service
&lt;/h3&gt;

&lt;p&gt;In order to provide an OpenAI-like API to other microservices internally on the cluster, we can apply a Kubernetes service that selects the vllm pods in the vllm namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm-service&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm-ns&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
    &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8000&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vllm&lt;/span&gt;
  &lt;span class="na"&gt;sessionAffinity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;None&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterIP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This simply selects the &lt;code&gt;app: vllm&lt;/code&gt; pods and targets vLLM's port 8000. The service gets picked up by the internal Kubernetes DNS server, and we can use the resolved “vllm-service.vllm-ns” endpoint to be load balanced across the vLLM APIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;p&gt;Let's hit this vLLM Kubernetes service endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# hitting the vllm-service internal api endpoint resolved by Kubernetes DNS&lt;/span&gt;

curl vllm-service.vllm-ns.svc.cluster.local/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
     "model": "mistralai/Mistral-7B-Instruct-v0.2",
     "messages": [{"role": "user", "content": "Why is the sky blue?"}]
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This "vllm-service.vllm-ns" internal Kubernetes service domain name will resolve to one of the nodes running a vLLM daemonset (again, load-balanced across all the running vLLM pods) and will return inference generation for the prompt "Why is the sky blue?":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cmpl-76cf74f9b05c4026aef7d64c06c681c4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chat.completion"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"created"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1715533000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mistralai/Mistral-7B-Instruct-v0.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"index"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The color of the sky appears blue due to a natural phenomenon called Rayleigh scattering. As sunlight reaches Earth's atmosphere, it interacts with molecules and particles in the air, such as nitrogen and oxygen. These particles scatter short-wavelength light, like blue and violet light, more than longer wavelengths, like red, orange, and yellow. However, we perceive the sky as blue and not violet because our eyes are more sensitive to blue light and because sunlight reaches us more abundantly in the blue part of the spectrum.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Additionally, some of the violet light gets absorbed by the ozone layer in the stratosphere, which prevents us from seeing a violet sky. At sunrise and sunset, the sky can take on hues of red, orange, and pink due to the scattering of sunlight through the Earth's atmosphere at those angles."&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"logprobs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"finish_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stop"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"stop_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"total_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;201&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;186&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;In the end, this gives our internal microservices running on the cluster a way to generate summaries without an expensive third-party API. We've gotten very good results from the Mistral models, and for this use case at this scale, running the service on our own GPUs has been significantly more economical.&lt;/p&gt;

&lt;p&gt;You could expand on this by adding network policies or additional configuration to your internal service, or even an ingress controller to provide this as a service to others outside of your cluster. The sky is the limit with what you can do from here! Good luck, and stay saucey!&lt;/p&gt;
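&lt;p&gt;As a sketch of that last idea (the hostname is hypothetical, this assumes your cluster already runs an ingress controller, and TLS and ingress class configuration are omitted), an Ingress pointing at the internal service might look something like:&lt;/p&gt;

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vllm-ingress           # hypothetical name
  namespace: vllm-ns
spec:
  rules:
  - host: llm.example.com      # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: vllm-service # the ClusterIP service defined above
            port:
              number: 80
```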

&lt;p&gt;If you want to check out StarSearch, &lt;a href="https://oss.fyi/wait-starsearch"&gt;join our waitlist now&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>kubernetes</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Awk: A beginners guide for humans</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Sun, 03 Mar 2024 22:16:52 +0000</pubDate>
      <link>https://dev.to/jpmcb/awk-a-beginners-guide-for-humans-3l25</link>
      <guid>https://dev.to/jpmcb/awk-a-beginners-guide-for-humans-3l25</guid>
      <description>&lt;p&gt;Earlier this week, I had a file of names, each delimited by a newline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;john
jack
jill
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But really, I needed this file to be in the form:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "full_name": "name"
},
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This file wasn't absolutely huge, but it was big enough that editing it manually would have been annoying. I thought to myself, "instead of editing this file manually or generating it correctly, how can I spend the maximum amount of time using a bespoke tool to get it in the right format? A neovim macro? Sed? Write some python? Why not awk!"&lt;/p&gt;

&lt;p&gt;In the end, here's the awk command I used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print "{\n    \"full_name\": \"" $0 "\"\n},"}'&lt;/span&gt; names.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This printed each line surrounded by the appropriate curly braces and whitespace.&lt;/p&gt;




&lt;p&gt;Let's break down how I did this and build the command one bit at a time:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Awk is a Linux command line utility just like any other. But, similar to something like python or lua, it's a special program interpreter that is especially good at scanning and processing inputs with small (or big) one-liner programs you give it.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'&amp;lt;an-awk-program&amp;gt;'&lt;/span&gt; some-input-file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Let's start simple and just print the names from the file directly to stdout:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $0}'&lt;/span&gt; names.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;john
jack
jill
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Within the &lt;code&gt;''&lt;/code&gt;, we provide awk with a small program to execute. This is basically the "hello world" of awk: it prints each line exactly as it appears, unedited, in the file.&lt;/p&gt;

&lt;p&gt;But what is &lt;code&gt;$0&lt;/code&gt;? Awk has the concept of "columns" in a file: these are typically space delimited. So a file like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1 2 3
4 5 6 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;has 3 columns and 2 rows.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;$0&lt;/code&gt; variable is a special one and represents the entire input record (the whole line). Each &lt;code&gt;$N&lt;/code&gt; is then the N-th field in that record, where &lt;code&gt;$1&lt;/code&gt; is the first column.&lt;/p&gt;

&lt;p&gt;So, if we only wanted the 1st column in the above file with 3 columns, we could run the following awk program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $1}'&lt;/span&gt; numbers.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1
4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we only wanted the 2nd and 3rd columns, we could run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print $2 " " $3}'&lt;/span&gt; numbers.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2 3
5 6
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(Notice the blank &lt;code&gt;" "&lt;/code&gt; we provide as a string to force some whitespace so the columns are formatted closer to what exists in the original file.)&lt;/p&gt;
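&lt;p&gt;As an aside (not needed for our task, but handy to know): awk also tracks the number of fields in the current record in the built-in &lt;code&gt;NF&lt;/code&gt; variable, so &lt;code&gt;$NF&lt;/code&gt; is always the last field:&lt;/p&gt;

```shell
# NF is the field count for the current record, so $NF is the last field
printf '1 2 3\n4 5 6\n' | awk '{print NF, $NF}'
# 3 3
# 3 6
```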

&lt;ol&gt;
&lt;li&gt;Next, let's add in some additional text to print out:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print "{\"full_name\": \"" $0 "\"},"}'&lt;/span&gt; names.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first thing you'll notice is a confusing array of &lt;code&gt;"&lt;/code&gt; characters: the first &lt;code&gt;"&lt;/code&gt; denotes the beginning of a string for awk to print. The subsequent &lt;code&gt;\"&lt;/code&gt; are literal escaped quotes which we &lt;em&gt;want&lt;/em&gt; to appear in the output. We end the first string with a standalone &lt;code&gt;"&lt;/code&gt;, print the line with the &lt;code&gt;$0&lt;/code&gt; variable, and then enter a string again to add the trailing bracket &lt;code&gt;}&lt;/code&gt; and comma &lt;code&gt;,&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;When run, this outputs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"full_name": "john"},
{"full_name": "jack"},
{"full_name": "jill"},
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Now we're getting somewhere! Let's finish this off by adding the additional white spacing:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'{print "{\n    \"full_name\": \"" $0 "\"\n},"}'&lt;/span&gt; names.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "full_name": "john"
},
{
    "full_name": "jack"
},
{
    "full_name": "jill"
},
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The added whitespace within the strings (by including the literal escaped newlines &lt;code&gt;\n&lt;/code&gt;) are printed to give the correct, desired output!&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Bonus: what if we wanted to remove the trailing comma? What if we wanted to wrap this all in &lt;code&gt;[...]&lt;/code&gt; to be closer to valid JSON? Yeah, yeah, I know, &lt;code&gt;jq&lt;/code&gt; exists, but by the power of our lord and savior awk, all things are possible!!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To remove the trailing comma, we can use a sliding window technique:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s1"&gt;'NR &amp;gt; 1 {print prev ","} {prev = "{\n    \"full_name\": \"" $0 "\"\n}"} END {print prev}'&lt;/span&gt; names.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This introduces a bit more complexity.&lt;/p&gt;

&lt;p&gt;First, we add the &lt;code&gt;NR&lt;/code&gt; concept: &lt;code&gt;NR&lt;/code&gt; is the "number of records" processed so far, which inside a rule is effectively the current line number. This can be really useful for checking progress, doing different things based on the number of records processed, etc.&lt;/p&gt;
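&lt;p&gt;A quick illustration of &lt;code&gt;NR&lt;/code&gt; on its own (not part of our final command, just to show the counter) is numbering each line of input:&lt;/p&gt;

```shell
# NR increments once per record, so this prefixes each line with its number
printf 'john\njack\njill\n' | awk '{print NR, $0}'
# 1 john
# 2 jack
# 3 jill
```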

&lt;p&gt;So, after the first record, we print the previous chunk followed by a comma. We also always store the current chunk in a &lt;code&gt;prev&lt;/code&gt; variable: this is the sliding window. Nothing is actually printed when the first record is processed; its output is simply stored in &lt;code&gt;prev&lt;/code&gt; to be printed on the next iteration. This way, we're always one record behind the current one, and when we reach the very end (using the &lt;code&gt;END&lt;/code&gt; keyword), we can print the final chunk without the trailing comma!&lt;/p&gt;
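&lt;p&gt;Running the sliding-window one-liner over the same names makes the effect concrete: every object gets a trailing comma except the last one, which is printed by the &lt;code&gt;END&lt;/code&gt; block:&lt;/p&gt;

```shell
# The sliding-window one-liner from above, run against the same names
printf 'john\njack\njill\n' > names.txt
awk 'NR > 1 {print prev ","} {prev = "{\n    \"full_name\": \"" $0 "\"\n}"} END {print prev}' names.txt
# {
#     "full_name": "john"
# },
# {
#     "full_name": "jack"
# },
# {
#     "full_name": "jill"
# }
```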

&lt;p&gt;To wrap the entire output in square brackets and give it the correct spacing, we can use this awk program:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight awk"&gt;&lt;code&gt;&lt;span class="kr"&gt;BEGIN&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# Print the opening bracket for the JSON array&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s2"&gt;"["&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kc"&gt;NR&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# after the first line, print the previously stored chunk&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="s2"&gt;","&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# Store the current line in a JSON object format&lt;/span&gt;
    &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"    {\n        \"full_name\": \""&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt; &lt;span class="s2"&gt;"\"\n    }"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kr"&gt;END&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;# Print the last line stored in prev and close the JSON array&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="s2"&gt;"\n]"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can run this awk program via a file instead of doing all of that on the command line directly. This greatly helps with readability, maintainability, etc.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; format_names.awk names.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
    {
        "full_name": "john"
    },
    {
        "full_name": "jack"
    },
    {
        "full_name": "jill"
    }
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just like the previous awk program, we are printing each segment and then at the end, leaving off the trailing comma. But this time, at the beginning of the program, using &lt;code&gt;BEGIN&lt;/code&gt; and &lt;code&gt;END&lt;/code&gt;, we print an opening and closing bracket.&lt;/p&gt;




&lt;p&gt;Happy awk-ing and good luck!&lt;/p&gt;

</description>
      <category>linux</category>
      <category>cli</category>
      <category>terminal</category>
    </item>
    <item>
      <title>Job scheduling with tmux</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Mon, 15 Jan 2024 23:41:43 +0000</pubDate>
      <link>https://dev.to/jpmcb/job-scheduling-with-tmux-5hb4</link>
      <guid>https://dev.to/jpmcb/job-scheduling-with-tmux-5hb4</guid>
      <description>&lt;p&gt;Tmux is one of my favorite utilities: it's a terminal multiplexer that lets you create persistent shell sessions, panes, windows, etc. all within a single terminal. It's a great way to organize your shell sessions and natively give you multi-shell environments to work in without having to rely on a terminal program for those features.&lt;/p&gt;

&lt;p&gt;You'd think in a world of modern applications and fancy terminals like iTerm 2 and Kitty, you wouldn't need such a utility. But time and time again, tmux has proven itself to be a powerful and essential tool. Especially when working with remote machines in the cloud or across SSH sessions, tmux is critical in maintaining my organization and getting things done.&lt;/p&gt;

&lt;p&gt;Beyond multiplexing, tmux has some incredible capabilities that extend its functionality to be able to run and schedule jobs, automatically execute scripts within given contexts, and much more.&lt;/p&gt;

&lt;p&gt;Let's look at a few use cases where we can schedule jobs to run and even create a whole production like environment, all organized and managed from tmux!&lt;/p&gt;

&lt;h2&gt;
  
  
  Running commands
&lt;/h2&gt;

&lt;p&gt;Tmux offers a way to run scripts in new sessions automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; my-session &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; /path/to/directory &lt;span class="s1"&gt;'echo "Hello Tmux!" &amp;amp;&amp;amp; sleep 100'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's break this down: this arbitrary example creates a new session named "my-session", sets the session directory using the &lt;code&gt;-c&lt;/code&gt; flag, and then executes a command.&lt;/p&gt;

&lt;p&gt;This command will echo "Hello Tmux!" and then sleep for 100 seconds.&lt;/p&gt;

&lt;p&gt;When running this tmux command, we are automatically attached to the session and see "Hello Tmux!" printed at the top of the screen and then the &lt;code&gt;sleep&lt;/code&gt; command takes over. Once the &lt;code&gt;sleep&lt;/code&gt; command is done, the session exits.&lt;/p&gt;

&lt;p&gt;If we wanted to run this in the background, we could provide the &lt;code&gt;-d&lt;/code&gt; flag: this will keep the new session detached and run the given commands behind the scenes in the background.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ tmux new -s my-session 
  -d -c ~/workspace 'echo "hello world!" &amp;amp;&amp;amp; sleep 1000'

$ tmux ls
my-session: 1 windows (created Mon Jan 15 11:02:21 2024)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;tmux ls&lt;/code&gt; we can list out the current sessions and see &lt;code&gt;my-session&lt;/code&gt; is running with 1 window in the background. This is part of the power of tmux: you can have sessions exist and persist &lt;em&gt;outside&lt;/em&gt; of the current shell or session you are attached to. The sky is really the limit here and using multiple sessions, windows, and panes has become a cornerstone of my workflows.&lt;/p&gt;

&lt;p&gt;If we wanted to attach to the session and see the progress of the command we gave it, we could run &lt;code&gt;tmux a -t my-session&lt;/code&gt;. This will attach to the session named &lt;code&gt;my-session&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Persisting sessions
&lt;/h2&gt;

&lt;p&gt;This is all great, but not all that useful when we need to later observe the results of our command or persist the history: a session created to run a script will automatically close once the script completes.&lt;/p&gt;

&lt;p&gt;Instead, we can create a regular session and send it some commands remotely.&lt;/p&gt;

&lt;p&gt;As an example, let's say we needed to run some tests in the background on our TypeScript project with &lt;code&gt;npm run test&lt;/code&gt; and later observe the results. We can do this with the &lt;code&gt;send-keys&lt;/code&gt; command for sessions. Here, I'll be using the OpenSauced API as my playground:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new named session:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a new named, detached session&lt;/span&gt;
&lt;span class="c"&gt;# that starts in the given directory&lt;/span&gt;
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; my-npm-tests &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Send the command
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Send the test command to the session&lt;/span&gt;
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; my-npm-tests &lt;span class="s2"&gt;"npm run test"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things to note here:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Enter&lt;/code&gt; uses the special "key binding syntax" for sending a literal &lt;code&gt;Enter&lt;/code&gt; key at the end of the command. If we needed to send something else, like "control c", we could do that with &lt;code&gt;C-c&lt;/code&gt;, or &lt;code&gt;M-c&lt;/code&gt; for "alt c". Check the official man page, which has &lt;a href="https://man.openbsd.org/tmux.1#KEY_BINDINGS"&gt;a full description&lt;/a&gt; of what's possible with sending key bindings to sessions.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Attach to the session:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tmux a -t my-npm-tests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we've sent our test command to the session, at any point in the future we can attach to the session to see how it did and check the results. Since the session will be persisted after the command has run, there's no rush to observe the results! The shell's full history for that session will be right there when we need it!&lt;/p&gt;
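&lt;p&gt;If you only want a quick peek without attaching at all, tmux can also print a pane's current contents to stdout with &lt;code&gt;capture-pane -p&lt;/code&gt;. A small round-trip sketch (the session name here is arbitrary):&lt;/p&gt;

```shell
# Run a command in a detached session and read the pane contents
# without ever attaching
tmux new -d -s peek-demo
tmux send-keys -t peek-demo "echo hello from tmux" Enter
sleep 1                            # give the command a moment to run
tmux capture-pane -t peek-demo -p  # print the pane's contents to stdout
tmux kill-session -t peek-demo
```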

&lt;ol&gt;
&lt;li&gt;Check results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Within the attached session, we can see the full history of the &lt;code&gt;npm&lt;/code&gt; command that was sent and check the results! This session is persisted so we can use the shell from this session to do additional work, detach, close it, etc.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ npm run test
npm info using npm@9.6.7
npm info using node@v18.17.1

&amp;gt; @open-sauced/api@2.3.0-beta.2 test
&amp;gt; jest

npm info ok

$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Script it!
&lt;/h2&gt;

&lt;p&gt;What if there are 5 or 6 things I want to do behind the scenes? Maybe I have a build and test process that can run many things in parallel at once? Instead of using &lt;code&gt;send-keys&lt;/code&gt; manually, let's create a small script that can do this all for us!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# Create named, detached sessions&lt;/span&gt;
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; npm-test &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; npm-build &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api

&lt;span class="c"&gt;# Send commands to the detached sessions&lt;/span&gt;
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; npm-test &lt;span class="s2"&gt;"npm run test"&lt;/span&gt; Enter
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; npm-build &lt;span class="s2"&gt;"npm run build"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running this script yields the following tmux sessions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ tmux ls
npm-build: 1 windows (created Mon Jan 15 11:31:28 2024)
npm-test: 1 windows (created Mon Jan 15 11:31:28 2024)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and can be attached to in order to inspect the results of each command.&lt;/p&gt;

&lt;p&gt;If the commands to run within individual sessions are more complex than a sole one-liner, &lt;code&gt;send-keys&lt;/code&gt; can also run a script or &lt;code&gt;make&lt;/code&gt; command!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; kubernetes &lt;span class="s2"&gt;"make build"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this article, I'm assuming you always want to create a new session. But many of the same rules, flags, and syntaxes also apply to creating new windows, panes, etc. Tmux has a strong, consistent paradigm across the different ways to multiplex shells, so it'd be just as simple to create two windows instead of two sessions and then send commands to those:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# Create named windows&lt;/span&gt;
tmux new-window &lt;span class="nt"&gt;-n&lt;/span&gt; npm-test &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api
tmux new-window &lt;span class="nt"&gt;-n&lt;/span&gt; npm-build &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api

&lt;span class="c"&gt;# Send commands to the detached sessions&lt;/span&gt;
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; 0:npm-test &lt;span class="s2"&gt;"npm run test"&lt;/span&gt; Enter
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; 0:npm-build &lt;span class="s2"&gt;"npm run build"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things to note here: instead of &lt;code&gt;-s&lt;/code&gt; for a session name, we provide &lt;code&gt;-n&lt;/code&gt; for the new window name. You'll also notice the &lt;code&gt;send-keys&lt;/code&gt; target now includes a &lt;code&gt;:&lt;/code&gt;. The first part is the name of the session (in my case, the default session named &lt;code&gt;0&lt;/code&gt;) and the second part is the name of the window to send the keys to.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting env variables for sessions
&lt;/h3&gt;

&lt;p&gt;An important and powerful thing to remember here is environment variables: tmux can set global environment variables (env vars available to all new sessions) and session-based ones. In newer versions of tmux, I recommend setting session-local variables with the &lt;code&gt;-e&lt;/code&gt; flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; my-session &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MYVAR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;myvalue &lt;span class="nt"&gt;-c&lt;/span&gt; /dir
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This session named &lt;code&gt;my-session&lt;/code&gt; will have access to the &lt;code&gt;MYVAR&lt;/code&gt; environment variable we provided when creating the new session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ echo $MYVAR
myvalue
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Scheduling jobs with &lt;code&gt;at&lt;/code&gt; and scripts
&lt;/h2&gt;

&lt;p&gt;One of the more powerful things I've used this all for is local job scheduling. Let's look at 2 examples using &lt;code&gt;at&lt;/code&gt; and scripts:&lt;/p&gt;

&lt;h3&gt;
  
  
  One off &lt;code&gt;at&lt;/code&gt; scheduling
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;at&lt;/code&gt; is a very basic command line utility that comes packaged with many desktop Linux distros and lets you do very simple one-off scheduling.&lt;/p&gt;

&lt;p&gt;For example, let's say that you needed to do a git push 3 hours from now in a specific directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tmux new &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; git-push-later &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; /path/to/your/repo &lt;span class="s1"&gt;'echo "git push" | at now + 3 hours'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will create a new detached session named &lt;code&gt;git-push-later&lt;/code&gt; within the directory for your git repo, and it sends &lt;code&gt;git push&lt;/code&gt; to the &lt;code&gt;at&lt;/code&gt; command via a pipe with the argument "now + 3 hours".&lt;/p&gt;

&lt;p&gt;Looking at scheduled jobs via &lt;code&gt;at&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ at -l
1       Mon Jan 15 14:46:00 2024
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I can see there is a scheduled job! Cool!! This isn't &lt;em&gt;too&lt;/em&gt; different from just running &lt;code&gt;at&lt;/code&gt; manually from the current directory, but it can be really useful and powerful if I'm working in a different directory or need to quickly load up some env vars. Better yet, you can easily combine this into a script that loads some global tmux environment variables and then executes many &lt;code&gt;at&lt;/code&gt; commands in sequence.&lt;/p&gt;
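&lt;p&gt;As a sketch of that idea, here's a small script that creates a dedicated session with a session env var and queues a couple of &lt;code&gt;at&lt;/code&gt; jobs in it. The session name, variable, and delays are all hypothetical, and it's guarded so it does nothing where &lt;code&gt;tmux&lt;/code&gt; or &lt;code&gt;at&lt;/code&gt; aren't installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash

# Only proceed if both tmux and at exist on this machine
if command -v tmux &amp;gt;/dev/null 2&amp;gt;&amp;amp;1 &amp;&amp; command -v at &amp;gt;/dev/null 2&amp;gt;&amp;amp;1; then
  # A detached session with a session-scoped env var (tmux 3.2+)
  tmux new -d -s scheduled-jobs -e DEPLOY_ENV=staging 2&amp;gt;/dev/null || true

  # Queue several one-off jobs in sequence
  for delay in "now + 1 hour" "now + 2 hours"; do
    tmux send-keys -t scheduled-jobs "echo 'git push' | at $delay" Enter
  done
fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;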

&lt;h3&gt;
  
  
  Shell script scheduling
&lt;/h3&gt;

&lt;p&gt;There are &lt;em&gt;a lot&lt;/em&gt; of ways in Linux to do what I'm suggesting here, primarily through &lt;code&gt;cron&lt;/code&gt; and &lt;code&gt;crontab&lt;/code&gt;. But for a quick and dirty job that needs to repeat every so often in a background shell, it can be easier to just wrap the command in a loop with a &lt;code&gt;sleep&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="c"&gt;# The command to continously run&lt;/span&gt;
    npm run &lt;span class="nb"&gt;test&lt;/span&gt;

    &lt;span class="c"&gt;# Sleep for 5 minutes between runs&lt;/span&gt;
    &lt;span class="nb"&gt;sleep &lt;/span&gt;5m
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This can then be thrown in a script and executed via a tmux &lt;code&gt;send-keys&lt;/code&gt; command like we've seen:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; my-npm-tests &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"./run-tests-every-5-mins.sh"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why do it this way and not just have a cron job in the background?&lt;/p&gt;

&lt;p&gt;For observable things, like builds, tests, etc., I really like to have a persistent shell session that I can attach to, detach from, and occasionally keep track of.&lt;/p&gt;

&lt;p&gt;Usually with this method, these aren't things that are &lt;em&gt;too&lt;/em&gt; important, so if the tmux server dies, it's nothing I can't quickly spin back up with a little tmux script. It's nice having a sort of "location" where these jobs are running in the background but always reachable from a different tmux window or tab. I sometimes find I've lost track of things Linux abstracts away with &lt;code&gt;cron&lt;/code&gt;, &lt;code&gt;systemd&lt;/code&gt;, etc. (which is generally a good thing: I don't want to have to think about the things &lt;code&gt;systemd&lt;/code&gt; is managing!) So, instead, for the little things I need to keep an eye on, I choose to keep track of them in a tmux session!&lt;/p&gt;

&lt;h2&gt;
  
  
  Building production-like environments
&lt;/h2&gt;

&lt;p&gt;Using all of this, and with my weird tendency to keep track of things in tmux sessions, let's build a simple production-like environment using a starter script, Docker, and a few tmux sessions!&lt;/p&gt;

&lt;p&gt;Let's again look at an OpenSauced example: this starts a Postgres database in Docker, boots up the API (which will then attach to that database), and then starts the frontend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;

&lt;span class="c"&gt;# Create named, detached sessions&lt;/span&gt;
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; database &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; api &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/api
tmux new &lt;span class="nt"&gt;-s&lt;/span&gt; frontend &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; ~/workspace/opensauced/app

&lt;span class="c"&gt;# Start the database up&lt;/span&gt;
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; database &lt;span class="s2"&gt;"docker run -it --rm --name database -p 25060:5432 my_postgres_image:latest"&lt;/span&gt; Enter

&lt;span class="c"&gt;# Start the API&lt;/span&gt;
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; api &lt;span class="s2"&gt;"npm run start"&lt;/span&gt; Enter

&lt;span class="c"&gt;# Start the frontend app&lt;/span&gt;
tmux send-keys &lt;span class="nt"&gt;-t&lt;/span&gt; frontend &lt;span class="s2"&gt;"npm run start"&lt;/span&gt; Enter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Horrifying, I know.&lt;/p&gt;

&lt;p&gt;But surprisingly, I've found this to be a really great way to keep the various components of our system organized in a system I know well and can easily wrap my head around.&lt;/p&gt;

&lt;p&gt;Then, when I'm done with this environment, I can easily tear it down by stopping the tmux sessions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tmux kill-session database
tmux kill-session api
tmux kill-session frontend
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And that's it! Easy organization, job scheduling, and multi tasking with tmux! Let me know if you have questions!!&lt;/p&gt;

</description>
      <category>linux</category>
      <category>tmux</category>
      <category>cli</category>
      <category>terminal</category>
    </item>
    <item>
      <title>How we made our Go microservice 24x faster</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Thu, 14 Sep 2023 15:34:49 +0000</pubDate>
      <link>https://dev.to/opensauced/how-we-made-our-go-microservice-24x-faster-5h3l</link>
      <guid>https://dev.to/opensauced/how-we-made-our-go-microservice-24x-faster-5h3l</guid>
      <description>&lt;p&gt;As data intensive backend applications scale and grow, with larger data sets scaled out to higher availability, performance bottlenecks can quickly become major hurdles. Processing requests that once took mere milliseconds can suddenly become multi-minute problems.&lt;/p&gt;

&lt;p&gt;In this blog post, let’s take a look at some optimization strategies the OpenSauced pizza micro-service recently underwent. This backend service is a Go server that processes git commits by request, sometimes handling thousands of commits in a single request. You can almost think of it as a real-time batch processor that arbitrary clients can call to fetch and process the git commits within any git repo.&lt;/p&gt;

&lt;p&gt;These commits are all eventually indexed in a Postgres database, and most of these optimizations revolve around “batching” the Postgres calls instead of executing them one by one.&lt;br&gt;
For simplicity, our examples will use an arbitrary table called “my_table” with data that fits into the “my_data” column. Let’s dive in and take a look at how we can optimize!&lt;/p&gt;
&lt;h3&gt;
  
  
  Some setup first
&lt;/h3&gt;

&lt;p&gt;Before we can go too much further, let’s make sure the database connection is bootstrapped correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"database/sql"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;

    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="s"&gt;"github.com/lib/pq"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// In a real world scenario, use good password handling practices&lt;/span&gt;
    &lt;span class="c"&gt;// to handle connecting to the Postgres cluster!&lt;/span&gt;
    &lt;span class="n"&gt;connectString&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"host=my_host port=54321 user=my_postgres_user sslmode=require"&lt;/span&gt;

    &lt;span class="c"&gt;// Acquire the *sql.DB instance&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"postgres"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;connectString&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not open database connection: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// ping once to ensure the database connection is working&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Ping&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not ping database: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This little bit of Go code sets up our Postgres connection and makes a single Ping to the database to ensure that everything is set up correctly. We now have a working db instance, which abstracts away a pool of connections that makes concurrently querying and writing to the database a breeze. We don’t have to manage those connection pools ourselves; we get all of that for free through Go’s database/sql package and the pq driver!&lt;/p&gt;

&lt;h3&gt;
  
  
  The brute force approach
&lt;/h3&gt;

&lt;p&gt;When first written, the &lt;code&gt;pizza&lt;/code&gt; micro-service would process each individual piece of data one row at a time. Here’s a very arbitrary example that demonstrates inserting data values one at a time into a Postgres database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"INSERT INTO my_table(my_data) VALUES($1)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is essentially a raw, brute force approach.&lt;/p&gt;

&lt;p&gt;Round-trip inserts into the database for all data members become an O(n) operation which, depending on network latency and the power of your Postgres database, can quickly become a massive bottleneck. Even on a &lt;code&gt;localhost&lt;/code&gt; network, where latency can generally be ignored, a hunk of data containing many thousands of entries means inserts that take several milliseconds each, and that adds up very quickly.&lt;/p&gt;
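&lt;p&gt;To put some rough numbers on that (the ~2ms per statement here is just an assumed example latency, not a measurement from our service):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import "fmt"

func main() {
    // 10,000 rows at an assumed ~2ms round trip per INSERT
    rows := 10000
    perRowMs := 2
    fmt.Printf("%d rows x %dms = %d seconds of waiting on the database\n",
        rows, perRowMs, rows*perRowMs/1000)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;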

&lt;h3&gt;
  
  
  Just make it parallel!?
&lt;/h3&gt;

&lt;p&gt;In theory, if you never really needed to handle conflicts within the database or elegantly surface errors, making the whole process parallel may work just fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"INSERT INTO my_table(my_data) VALUES($1)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we are doing the same thing as the brute force approach, but we’re firing off a new goroutine for each insert.&lt;br&gt;
While you may see marginal performance improvements (depending on the system and how much real parallelism the machine’s processor can offer), this still requires O(n) inserts into the database and can quickly throttle the pool of connections available in the &lt;code&gt;*sql.DB&lt;/code&gt; we are using. And again, this doesn’t do a great job of handling multiple inserts that may conflict, ignores errors entirely, and never waits for the goroutines to finish. In other words, going with a parallel solution may seem like the ideal quick fix, but in reality, it may create more problems down the road.&lt;br&gt;
So, generally, this approach isn’t recommended.&lt;/p&gt;
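&lt;p&gt;If you genuinely did need parallelism here, you’d at least want to wait for the goroutines to finish and bound how many run at once so the connection pool isn’t overwhelmed. Here’s a minimal sketch of that pattern; the &lt;code&gt;insert&lt;/code&gt; function is a hypothetical stand-in for the real &lt;code&gt;db.Exec&lt;/code&gt; call so the example is self-contained:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "sync"
)

func main() {
    data := []string{"a", "b", "c", "d", "e"}

    // A local stand-in for db.Exec so this sketch runs anywhere
    var mu sync.Mutex
    inserted := 0
    insert := func(d string) error {
        mu.Lock()
        defer mu.Unlock()
        inserted++
        return nil
    }

    // A buffered channel as a semaphore: at most 2 in-flight "inserts"
    sem := make(chan struct{}, 2)
    var wg sync.WaitGroup
    for _, v := range data {
        wg.Add(1)
        sem &lt;- struct{}{}
        go func(d string) {
            defer wg.Done()
            defer func() { &lt;-sem }()
            if err := insert(d); err != nil {
                fmt.Println("insert failed:", err)
            }
        }(v)
    }
    wg.Wait()

    fmt.Println(inserted, "rows inserted")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Even so, this still makes O(n) round trips to the database, so batching remains the better fix.&lt;/p&gt;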
&lt;h3&gt;
  
  
  Using &lt;code&gt;CopyIn&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Thankfully, Postgres and the pq library offer powerful “transaction” paradigms that make it easy to batch massive sets of data all at once. If this were raw SQL, we’d be using the &lt;code&gt;COPY ... FROM&lt;/code&gt; statement to bulk-load data from a “file” directly into a table, all in one statement. Go’s pq library abstracts all that with the &lt;code&gt;CopyIn&lt;/code&gt; method and allows for large batching operations.&lt;/p&gt;

&lt;p&gt;Let’s take a quick look at how you would implement this and how it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Start a psql transaction.&lt;/span&gt;
&lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not start psql transaction: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Make a "statement" to use for the psql transaction. The "CopyIn" takes&lt;/span&gt;
&lt;span class="c"&gt;// our table name and the columns we are coping into.&lt;/span&gt;
&lt;span class="c"&gt;//&lt;/span&gt;
&lt;span class="c"&gt;// The error handling will rollback the transaction if there's a&lt;/span&gt;
&lt;span class="c"&gt;// problem with preparing the statement.&lt;/span&gt;
&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pq&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CopyIn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"my_table"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"my_data"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Rollback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not prepare psql statement: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Iterate the data and add the data to the psql statement&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not execute the statement: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Execute, commit, and close the transaction&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not close the psql statement: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not commit the psql transaction: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All in all, this takes our number of round trips to the database from O(n) to just O(1) with a constant, predictable number of Postgres statements that will be executed. Much more efficient!&lt;/p&gt;

&lt;h3&gt;
  
  
  What about conflicts with unique constraints?
&lt;/h3&gt;

&lt;p&gt;Taking all the data wholesale works fine if you can be relatively assured that there won’t ever be conflicts within it. But as soon as one of the rows you’re copying in has a unique identifier or some other unique constraint, you’ll run into major problems. For example, let’s say we’re processing a batch of emails and those emails being inserted into the database should all be unique: the above approach will fail as soon as a duplicate email is processed.&lt;/p&gt;

&lt;p&gt;Unfortunately, the &lt;code&gt;CopyIn&lt;/code&gt; approach we’re using doesn’t have a way to handle conflicts directly. We need a different approach: enter the temporary table!&lt;br&gt;
Postgres offers a pretty powerful way to take a temporary table and pivot it into your real data tables, all while giving you the ability to handle conflicts. We’ll use a similar approach as above, but instead of adding everything to the real &lt;code&gt;my_table&lt;/code&gt;, we’ll first create a temporary table to insert the data into:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;tmpTableName&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"my_tmp_table"&lt;/span&gt;

&lt;span class="c"&gt;// Create a temporary table and use the real table as a template.&lt;/span&gt;
&lt;span class="c"&gt;// "WHERE 1=0" is a trick to select no rows in psql but still copy 1 for 1&lt;/span&gt;
&lt;span class="c"&gt;// all the data column types and names from the real table.&lt;/span&gt;
&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CREATE TEMPORARY TABLE %s AS SELECT * FROM my_table WHERE 1=0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tmpTableName&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not create temporary table: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that we have a temporary table, we can use that in our CopyIn to do a mass insert:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Start a psql transaction.&lt;/span&gt;
&lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Begin&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not start psql transaction: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Make a "statement" to use for the psql transaction.&lt;/span&gt;
&lt;span class="c"&gt;// Notice the "my_tmp_table" as the table name&lt;/span&gt;
&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Prepare&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pq&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CopyIn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"my_tmp_table"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"my_data"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Rollback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not prepare psql statement: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Iterate the data, add the data to the psql statement&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Execute, commit, and close the transaction&lt;/span&gt;
&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not close the psql statement: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;txn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not commit the psql transaction: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, our temporary table has all the data: the table was created, the statement prepared, each data item added to the statement, and the transaction was committed.&lt;br&gt;
Now, we can attempt to pivot the data from the temporary table into the real table, handling conflicts along the way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`
    INSERT INTO my_table(my_data)
    SELECT my_data FROM my_tmp_table
    ON CONFLICT (my_data)
    DO NOTHING
`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not pivot temporary table data: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Drop the temporary table now that we're done pivoting the data&lt;/span&gt;
&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"DROP TABLE %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tmpTableName&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Could not drop temporary table: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In our example here, we use the temporary table’s data to mass-insert into the real table. We avoid conflicts by doing nothing, discarding the conflicting rows. In a real-world scenario, you may want to do something with that data: the &lt;code&gt;ON CONFLICT&lt;/code&gt; clause is really powerful, and there’s a lot you can do with it in Postgres.&lt;/p&gt;

&lt;h3&gt;
  
  
  Table name clashes
&lt;/h3&gt;

&lt;p&gt;If you’re running the temporary table pivot on a server that handles many concurrent requests at scale, an obvious problem arises: clashes on a static temporary table name. Since we create the temporary table for a request and then drop it once we’re done, other threads may still be using a table of the same name for operations of their own.&lt;/p&gt;

&lt;p&gt;There are many methods for handling temporary table name clashes, but a good place to start is to use a unique identifier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;rawUUID&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;uuid&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;strings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReplaceAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rawUUID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"-"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tmpTableName&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"temp_table_%s_%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;atomic&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AddInt64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This uses the github.com/google/uuid library to generate a UUID and replaces “-” with empty strings (dashes are not valid within unquoted Postgres table names). We then combine this with a thread-safe Go atomic counter to generate a unique table name: since these tables are short-lived, individual UUID clashes are extremely unlikely, and the atomic counter guarantees distinct suffixes, the likelihood of a table name clash is nearly zero with this basic approach.&lt;/p&gt;

&lt;p&gt;If you’re going to horizontally scale out your service to many additional instances, it may be advantageous to develop an orchestration method to ensure there are no conflicts with temporary table names across your scaled deployment.&lt;/p&gt;

&lt;p&gt;Overall, using batch inserts and table pivots in Postgres is a really powerful way to optimize your Go backends. Compared to the brute-force approach, we found that this generally improved performance about 24x: when processing a git repository with over 30,000 commits, the standard “one by one” approach took about 1 minute, but the batch approach laid out above takes only about 3 seconds. Wow! What an improvement!&lt;/p&gt;

&lt;p&gt;If you’re interested in diving in deeper on these methodologies and how we implemented them at OpenSauced, check out the original PR for this here!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://insights.opensauced.pizza/feed/471"&gt;https://insights.opensauced.pizza/feed/471&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Stay saucy friends!!&lt;/p&gt;

</description>
      <category>go</category>
      <category>opensource</category>
      <category>programming</category>
      <category>postgres</category>
    </item>
    <item>
      <title>There is no secure software supply-chain</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Sun, 03 Sep 2023 17:03:58 +0000</pubDate>
      <link>https://dev.to/jpmcb/there-is-no-secure-software-supply-chain-244m</link>
      <guid>https://dev.to/jpmcb/there-is-no-secure-software-supply-chain-244m</guid>
      <description>&lt;p&gt;Years ago, entrepreneurs and innovators predicated that &lt;a href="https://a16z.com/2011/08/20/why-software-is-eating-the-world/"&gt;“software would eat the world”.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And to little surprise, year after year, the world has become more and more reliant on software solutions. Oftentimes, that software is (or indirectly depends on) some open source software, maintained by a group of people whose only affiliation to one another may be participation in that open source project’s community.&lt;/p&gt;

&lt;p&gt;But we’re in trouble. The security of open source software is under threat and we’re running out of people to reliably maintain those projects. And as our stacks get deeper, our dependencies become more interlinked, leading to terrifying compromises in the secure software supply-chain. For a perfect example of what’s happening in the open source world right now, we don’t need to look much further than the extremely popular &lt;a href="https://github.com/orgs/gorilla/repositories"&gt;Gorilla toolkit for Go.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In December of 2022, Gorilla, a project that provided powerful web framework technology like mux and sessions, was archived. Over its lengthy tenure, it was the de facto Go framework for web servers, routing requests, handling HTTP traffic, and using websockets. It was used by tens of thousands of other software packages, and it came as a shock to most people in the Go community that the project would be no more: no longer maintained, no more releases, and no community support. But for anyone paying close enough attention, the signs of turmoil were clear: &lt;a href="https://github.com/gorilla/websocket/issues/370"&gt;open calls for maintainers&lt;/a&gt; went unanswered, there were few active outside contributors, and the burden of maintainership was very heavy.&lt;/p&gt;

&lt;p&gt;The Gorilla framework was one of those “important dependencies”. It sat at the critical intersection of providing nice quality of life tools while still securely handling important payloads. Developers would mold their logic around the APIs provided by Gorilla and entire codebases would be shaped by the use of the framework. The community at large trusted Gorilla; the last thing you want in your server is a web framework riddled with bugs and CVEs. In the secure software supply-chain, much like Nginx and OpenSSL, it’s a project that was at the cornerstone of many other supply-chains and dependencies. If something went wrong in the Gorilla framework, it had the potential to impact millions of servers, services, and other projects.&lt;/p&gt;

&lt;p&gt;The secure software supply-chain is one of those abstract concepts that giant tech companies, security firms, and news outlets all love to turn into a buzzword. It’s the “idea” that the software you are consuming as a dependency, all the way through your stack, is exactly the software you’re expecting to consume. In other words, it’s the assurance that some hacker didn’t inject a backdoor into a library or build tool you use, compromising your entire product, software library, or even company. Supply-chain attacks are mischievous because they almost never go after the actual intended target. Instead, they compromise some dependency to then go after the intended target.&lt;/p&gt;

&lt;p&gt;The classic example, still to this day, is &lt;a href="https://www.gao.gov/blog/solarwinds-cyberattack-demands-significant-federal-and-private-sector-response-infographic"&gt;the SolarWinds attack:&lt;/a&gt; an unnamed, Russian state-backed hacker group was able to compromise the internal SolarWinds build system, leaving any subsequent software built using that system injected with backdoors and exploits. &lt;a href="https://www.nytimes.com/2020/12/14/us/politics/russia-hack-nsa-homeland-security-pentagon.html"&gt;The fallout from this attack was massive.&lt;/a&gt; Many government agencies, including the State Department, confirmed massive data breaches. The estimated cost of this attack continues to rise and &lt;a href="https://www.nytimes.com/2020/12/16/us/politics/russia-hack-putin-trump-biden.html"&gt;is estimated to be in the billions of dollars.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Product after product has popped up in the last few years to try and solve these problems: software signing solutions, automated security scanning tools, up-to-date CVE databases, automation bots, AI-assisted coding tools, etc. There was even a whole White House council on the subject. The federal government knows this is the most important (and most critically vulnerable) vector for the well-being of our nation’s software infrastructure, and it has been taking direct action to fight these kinds of attacks.&lt;/p&gt;

&lt;p&gt;But the secure software supply-chain is also one of those things that falls apart quickly; without delicate handling and meticulous safeguarding, things go south fast. For months, the Gorilla toolkit had an open call for maintainers, seeking additional people to keep its codebases up to date, secure, and well maintained. But in the end, the Gorilla maintainers couldn’t find enough people to keep the project afloat. Many people volunteered but then were never seen again. &lt;a href="https://github.com/gorilla#gorilla-toolkit"&gt;And the bar for maintainer-ship was rightfully very high:&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;just handing the reins of even a single software package that has north of 13k unique clones a week (mux) is just not something I’d ever be comfortable with. This has tended to play out poorly with other projects.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And in the past, this has played out poorly in other projects:&lt;/p&gt;

&lt;p&gt;In 2018, GitHub user FallingSnow opened &lt;a href="https://github.com/dominictarr/event-stream/issues/116"&gt;the issue “I don’t know what to say.”&lt;/a&gt; in the popular, but somewhat unknown, NPM JavaScript package event-stream. He'd found something very peculiar in recent commits to the library. A new maintainer, not seen in the community before, with what appeared to be an entirely new GitHub account, had committed a strange piece of code directly to the main branch. This unknown new maintainer had also cut a new package to the NPM registry, forcing this change onto anyone tracking the latest packages in their project.&lt;/p&gt;

&lt;p&gt;The changes looked like this: in a new file, a long inline encrypted string was added. The string would be decrypted using some unknown environment variable, and then the decrypted string would be injected as a JavaScript module into the package, effectively executing whatever code was hidden behind the encrypted string. In short, unknown code was being deciphered, injected, and executed at runtime.&lt;/p&gt;

&lt;p&gt;The GitHub issue went viral. And through sheer brute force, a bit of luck, and hundreds of commenters, the community was able to decrypt the string, revealing the injected code’s purpose: a cryptocurrency “wallet stealer”. If the code detected a specific wallet on the system, it used a known exploit to steal all the crypto stored in that wallet.&lt;/p&gt;

&lt;p&gt;This exploitative code lived in the event-stream NPM module for months, going undetected by security scanners, consumers, and the project’s owner. Only when someone in the community was curious enough to take a look did this obvious code-injection attack become clear. But what made this attack especially bad was that the event-stream module was used by many other modules (and those modules used by other modules, and so on). In theory, this potentially affected thousands of software packages and millions of end-users. Developers who had no idea their JavaScript used event-stream deep in their dependency stack were now suddenly having to quickly patch their code. How was this even possible? Who approved and allowed this to happen?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dominictarr/event-stream/issues/116#issuecomment-440927400"&gt;The owner of the GitHub repository, and original author of the code, said:&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;he emailed me and said he wanted to maintain the module, so I gave it to him. I don't get any thing from maintaining this module, and I don't even use it anymore, and havn't for years.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;and&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;note: I no longer have publish rights to this module on npm.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Just like that, just by asking, some bad actor was able to compromise tens of thousands of software packages, going undetected through the veil of “maintainership”.&lt;/p&gt;

&lt;p&gt;In the past, I’ve referred to this as “The Risks of Single Maintainer Dependencies”: the overwhelming, often lonely, and sometimes dangerous experience of maintaining a widely distributed software package on your own. Like the owner of event-stream, most solo maintainers drift away, fading into the background to let their software go into disarray.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/gorilla#gorilla-toolkit"&gt;This was the case with Gorilla:&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The original author and maintainer, moraes, had moved on a long time ago. kisielk and garyburd had the longest run, maintaining a mix of the HTTP libraries and gorilla/websocket respectively. I (elithrar) got involved sometime in 2014 or so, when I noticed kisielk doing a lot of the heavy lifting and wanted to help contribute back to the libraries I was using for a number of personal projects. Since about ~2018 or so, I was the (mostly) sole maintainer of everything but websocket, which is about the same time garyburd put out an (effectively unsuccessful) call for new maintainers there too.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The secure software supply-chain will never truly be strong and secure as long as a single solo maintainer is able to disrupt an entire ecosystem of packages by giving their package away to some bad actor. In truth, there is no secure software supply-chain: we are only as strong as the weakest among us and too often, those weak links in the chain are already broken, left to rot, or given up to those with nefarious purposes.&lt;/p&gt;

&lt;p&gt;Whenever I bring up this topic, someone always asks about money. Oh, money, life’s truest satisfaction! And yes! Money can be a powerful motivator for some people. But it’s a sad excuse for what the secure software supply-chain really needs: true reliability. The software industry can throw all the money it wants at maintainers of important open source projects, &lt;a href="https://www.theverge.com/23499215/valve-steam-deck-interview-late-2022"&gt;something Valve has started doing:&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Griffais says the company is also directly paying more than 100 open-source developers to work on the Proton compatibility layer, the Mesa graphics driver, and Vulkan, among other tasks like Steam for Linux and Chromebooks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;but at some point, it becomes unreasonable to ask just a handful of people to hold up the integrity, security, and viability of your company’s entire product stack. If it’s that important, why not hire some of those people, build a team of maintainers, create processes for contribution, and allocate developer time to the open source? Too often I hear about solving open source problems by just throwing money at them, but at some point, the problems of scaling software delivery outweigh any amount you can possibly pay a few people. Let’s say you were building a house: it might make sense to have one or two people work on the foundation. But if you’re zoning and building an entire city block, I’d sure hope you’d put an entire team on planning, building, and maintaining those foundations. No amount of money will make just a few people build a strong and safe foundation all by themselves. But what we’re asking some open source maintainers to do is to plan, build, and coordinate the foundations for an entire world.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/gorilla#gorilla-toolkit"&gt;And this is something the Gorilla maintainers recognized as well:&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;No. I don’t think any of us were after money here. The Gorilla Toolkit was, looking back at the most active maintainers, a passion project. We didn’t want it to be a job.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For them, it wasn’t about the money, so throwing any amount at the project wouldn’t have helped. It was about the software’s quality, maintainability, and the kind of intrinsic satisfaction it provided.&lt;/p&gt;

&lt;p&gt;So then, how can we incentivize open source maintainers to maintain their software in a scalable, realistic way? Some people are motivated by the altruistic value they provide to a community. Some are motivated by fame, power, and recognition. Others still just want to have fun and work on something cool. It’s impossible to understand the complicated, interlinked way&lt;br&gt;
different people in an open source community are all motivated. Instead, the best solution is obvious: if you are on a team that relies on some piece of open source software, allocate real engineering time to contributing, being a part of the community, and helping maintain that software. Eventually, you’ll get a really good sense of how a project operates and what motivates its main players. And better yet, you’ll help alleviate the heavy burden of solo maintainership.&lt;/p&gt;

&lt;p&gt;Sometimes, I like to think of software like it’s a wooden canoe, its many dependencies making up the wooden strips of the boat. When first built, it seems sturdy, strong, and able to withstand the harshest of conditions. Its first coat of oil finish is fresh and beautiful, its wood grains smooth and&lt;br&gt;
unbent. But as the years wear on, eventually, its finish fades, its wooden strips need replacing, and maybe, if it takes on water, it requires time and new material to repair. Neglected long enough, its wood could mold and rot from the inside, completely compromising the integrity of the boat. And just like a boat, software requires time, energy, maintenance, and “hands-on-deck” to ensure its many links in the secure software supply-chain are strong. Otherwise, the termites of time and the rot of bad actors weaken links in the chain, compromising the stability of it all.&lt;/p&gt;

&lt;p&gt;In the end, the maintainers of the Gorilla framework did the right thing: they decommissioned a widely used project that was at risk of rotting from the inside out. And instead of letting it live in disarray or potentially fall into the hands of bad actors, it is simply gone. Its link on the chain of software has been purposefully broken to force anyone using it to choose a better, and hopefully more secure, option.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I do believe that open source software is entitled to a lifecycle — a beginning, a middle, and an end — and that no project is required to live on forever. That may not make everyone happy, but such is life.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But earlier this year, people in the Gorilla community noticed something: a new group of individuals from Red Hat had been added as maintainers to the Gorilla GitHub org. Was Red Hat taking the project over? No, but ironically, the emeritus maintainers had done exactly what they promised they would never do: at the 11th hour, they handed over the project to people with little vetting from the community.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;To address many comments that we have seen - we would like to clarify that Red Hat is not taking over this project. While the new Core Maintainers all happen to work at Red Hat, our hope is that developers from many different organizations and backgrounds will join the project over time.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Maybe Gorilla was too important to drift slowly into obscurity and Red Hat rightfully allocated some engineering resources to the project. Gorilla lives on. Here's hoping the code is in good hands.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>go</category>
    </item>
    <item>
      <title>Prestige Over Influence: Choosing A More Impactful Online Presence</title>
      <dc:creator>John McBride</dc:creator>
      <pubDate>Fri, 11 Aug 2023 21:54:24 +0000</pubDate>
      <link>https://dev.to/jpmcb/prestige-over-influence-choosing-a-more-impactful-online-presence-5cje</link>
      <guid>https://dev.to/jpmcb/prestige-over-influence-choosing-a-more-impactful-online-presence-5cje</guid>
      <description>&lt;p&gt;The world of software engineering influencers, what I typically like to refer to as “tech-fluencers”, has grown significantly in the last few years. There are people who have built entire personal brands and businesses solely on the basis of their online tech content. And many massive technology companies now participate in the same spheres that 5 years ago would have been unheard of (just think about all the memes major tech companies have created in the last few years).&lt;/p&gt;

&lt;p&gt;And with the rise of platforms that promote short-form video content, like TikTok and YouTube Shorts, it’s now easier than ever to build branding and create a catalog of niche content designed to fulfill a void somewhere out there on the internet.&lt;/p&gt;

&lt;p&gt;But I’ve seen a big problem with all of this.&lt;/p&gt;

&lt;p&gt;We often see others with significant reach in online tech spaces and assume that the only way to achieve that kind of corporate success, financial well-being, confidence, seniority status, or whatever else their persona amplifies, is to emulate them and make content to also achieve that reach, success, and influence in the industry.&lt;/p&gt;

&lt;p&gt;From my first hand experience, this is simply not true.&lt;/p&gt;

&lt;p&gt;Years ago, I fell into the mental trap of creating tech content online: partly out of boredom during the pandemic and partly because I was looking for new ways to level up my career. I thought that creating content online, like I saw so many other people doing, would be an accelerator for me. I started a TikTok account. During its heyday, the account reached over 140 thousand followers. This led to a YouTube channel, a Twitch stream, daily content generation, and much more.&lt;/p&gt;

&lt;p&gt;And honestly, after hundreds and hundreds of videos, none of it really sticks out as actually being significant to my career. After all, most of it was fluff and memes without a lot of substance.&lt;/p&gt;

&lt;p&gt;This is the trap of content creation that is all too tantalizing: you may start with pure intent, but eventually you find yourself feeding the algorithms a never-ending stream of content in the hopes of achieving some amorphous goal that has warped into something you don’t recognize anymore.&lt;/p&gt;

&lt;p&gt;I eventually took a big step back from the content creator grind and ultimately felt pretty disappointed in what seemed like a huge wasted effort.&lt;/p&gt;

&lt;p&gt;I think Will Larson sums this all up incredibly well in his piece &lt;a href="https://lethain.com/tech-influencer/"&gt;“How to be a tech influencer”&lt;/a&gt;. He says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Most successful people are not well-known online.&lt;/em&gt; If you participate frequently within social media, it’s easy to get sucked into the reality distortion field it creates. Being well-known in an online community feels equivalent to professional credibility when you’re spending a lot of time in that community. My experience is that very few of the most successful folks I know are well-known online, and many of the most successful folks I know don’t create content online at all.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Instead, there is an alternative approach: prestige.&lt;/p&gt;

&lt;p&gt;Building a long term, successful tech career is not about having large followings in online tech spaces or massive engagement on content. Chasing those metrics will only lead you down that road of churning out content for the sake of staying relevant in whatever algorithm you’re participating in.&lt;/p&gt;

&lt;p&gt;No, one of the many puzzle pieces in building a fruitful tech career involves building prestige.&lt;/p&gt;

&lt;p&gt;Prestige is the “idea” of someone, based on respect for the things they’ve achieved, the battles they’ve won, and the quality of their character.&lt;/p&gt;

&lt;p&gt;When I was at AWS, I could tell who the prestigious engineers were based on the way other people talked about them, how others approached that person’s code, and how that person could command a room. Prestige is easy to see, difficult to measure, and elusive to obtain.&lt;/p&gt;

&lt;p&gt;Don’t be mistaken: you may read that and assume prestige and fear are close neighbors. But prestige is not about control, making others do what you want, or power. Prestige on one hand is about gaining others’ respect. But on the other, it’s about having self-respect, owning your mistakes, humility, kindness, and above all, keeping yourself accountable to the high bar of quality and character that you hold for yourself.&lt;/p&gt;

&lt;p&gt;Measuring your prestige is much more difficult than tracking your influence. It’s easy to see the number of followers on your online accounts go up, but tracking the respect and repute people have for you is a whole different challenge.&lt;/p&gt;

&lt;p&gt;This can make attempting to generate prestige difficult. How can I drum up respect and prestige for myself across the industry if I can’t really measure it effectively?&lt;/p&gt;

&lt;p&gt;Ironically, generating prestige with online content can be a very successful way to go about amplifying your existing reputation. Experimenting with different forms of content and distribution models is important, but I want to stress that creating content to amplify your prestige should not be the same as content creation (at least in the typical, 2023 sense). You should not fall prey to the temptations of algorithms designed to steal your attention and sap your creative energy. You should simply use them as a tool of distribution if necessary.&lt;/p&gt;

&lt;p&gt;But more importantly, the quality of your content matters significantly more than the quantity. Typical social media influence dictates that you must post on a regular schedule. But for the engineering leader looking to grow their prestige, one or two extremely high quality pieces go a very, very long way. It’s not necessary to always be churning out content, since relevance in typical social media algorithms should not be your end goal.&lt;/p&gt;

&lt;p&gt;So, how do you actually go about building prestige? Here are my 5 approaches to growing your prestige within your engineering organization and online:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Invent
&lt;/h3&gt;

&lt;p&gt;You should be finding ways to solve big technical problems that have increasing impact and that grow your status within the engineering org.&lt;/p&gt;

&lt;p&gt;This should really be the prerequisite to building any sort of prestige. But it may not be obvious to all: it can be easy to get stuck in a loop of finishing tickets and completing all your tasks during a sprint without expanding into more challenging territories.&lt;/p&gt;

&lt;p&gt;But if you’re not finding technical problems to solve that require innovation, expertise, and a bit of the inventor’s mindset, you’ll eventually hit a career ceiling.&lt;/p&gt;

&lt;p&gt;It is possible to build prestige without inventing. You can get pretty good at taking credit for others’ work or faking it till you make it. But eventually, this catches up with you, and you reach a point where your persona is hollow and it’s clear the achievements your reputation is built upon can’t be trusted or respected.&lt;/p&gt;

&lt;p&gt;Inventing, building, and solving increasingly challenging technical problems is the backbone of building any kind of technical prestige.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Newsletters
&lt;/h3&gt;

&lt;p&gt;Internal newsletters to your organization are a great way to communicate what you’re doing, what you’ve invented, and brag a bit about some of your technical achievements.&lt;/p&gt;

&lt;p&gt;For some, this may seem out of reach. Aren’t these kinds of newsletters reserved for VPs and engineering leaders?&lt;/p&gt;

&lt;p&gt;Not necessarily. An opt-in type newsletter is the best place to start (i.e. don’t start a newsletter and send it to everyone in the company). Your manager and other teammates will likely want to opt in. After all, why wouldn’t they want a regular email of what you’ve been working on, things that interest you, and pieces of work you’re particularly proud of that week?&lt;/p&gt;

&lt;p&gt;Newsletters are also a great habit to build, since they force you to quantify and qualify your work on a regular cadence, which can later be translated into talks, deep dives, promotion documents, or other content you can share with your org or the wider world.&lt;/p&gt;

&lt;p&gt;Some people take this to the next level and publish a public newsletter. This can be a really cool avenue for those working “in public” and can be a great way to start connecting with other technical leaders out in the industry.&lt;/p&gt;

&lt;h3&gt;3. Talks&lt;/h3&gt;

&lt;p&gt;Technical talks come in many different shapes and sizes. I would consider a “talk” to be anything from showing something off during your team’s weekly demos all the way up to an international keynote at a large conference.&lt;/p&gt;

&lt;p&gt;The different ends of that spectrum obviously have different levels of reach and impact, but both help establish you as a subject-matter expert in the thing you’re talking about. It’s an automatic way to gain some prestige around the topic, and it’ll likely open you up to connections with others in the audience that may lead to further opportunities (as the wheels of prestige go round)!&lt;/p&gt;

&lt;h3&gt;4. Deep dives&lt;/h3&gt;

&lt;p&gt;Technical deep dives also come in many shapes and forms. It may be a written piece (like this!), a video, a seminar, or really anything that can deeply communicate a technical topic.&lt;/p&gt;

&lt;p&gt;Deep dives are great for generating prestige since they can be easily referenced later. They end up being a sort of time machine you can use and recycle in powerful ways. I’ve seen people turn deep dives into conference talks, business pitches, and even entire products!&lt;/p&gt;

&lt;p&gt;But above all, they are useful for establishing your expertise and prowess in a given technical area.&lt;/p&gt;

&lt;h3&gt;5. Get others to talk about it&lt;/h3&gt;

&lt;p&gt;The most powerful, and maybe most difficult, avenue to building prestige is getting other people to talk about you and your work. At this point, the wheels of prestige are fully turning and will keep moving on their own for a fair amount of time.&lt;/p&gt;

&lt;p&gt;Having a wealth of talks, deep dives, and newsletters ensures that other people (like your boss or your co-workers) have something to talk about.&lt;/p&gt;

&lt;p&gt;And remember, prestige holds you to the highest bar of quality. So regardless of how many years it’s been, if people are talking about you, discussing a talk you gave, or chatting about something you’ve achieved, you know it’s something you can be proud of and respect yourself for.&lt;/p&gt;

&lt;p&gt;Prestige is an incredible asset to build within your engineering organization and out in public. It’s a great approach for anyone looking to really level up their career. And in my experience, it’s a much better method than the typical “tech-fluencer” content grind.&lt;/p&gt;

</description>
      <category>career</category>
    </item>
  </channel>
</rss>
