<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jintao Zhang</title>
    <description>The latest articles on DEV Community by Jintao Zhang (@zhangjintao).</description>
    <link>https://dev.to/zhangjintao</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F691707%2Fcd74d9d6-483c-42a4-a745-3d954978db19.png</url>
      <title>DEV Community: Jintao Zhang</title>
      <link>https://dev.to/zhangjintao</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zhangjintao"/>
    <language>en</language>
    <item>
      <title>From Deprecated npm Classic Tokens to OIDC Trusted Publishing: A CI/CD Troubleshooting Journey</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Sun, 04 Jan 2026 02:03:03 +0000</pubDate>
      <link>https://dev.to/zhangjintao/from-deprecated-npm-classic-tokens-to-oidc-trusted-publishing-a-cicd-troubleshooting-journey-4h8b</link>
      <guid>https://dev.to/zhangjintao/from-deprecated-npm-classic-tokens-to-oidc-trusted-publishing-a-cicd-troubleshooting-journey-4h8b</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;In January 2026, I encountered a series of cryptic authentication errors while publishing an npm package. This post documents the complete journey from problem discovery to final resolution—hopefully saving others from the same headaches.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;I maintain an npm package called &lt;a href="https://www.npmjs.com/package/amp-acp" rel="noopener noreferrer"&gt;amp-acp&lt;/a&gt;, an adapter that bridges Amp Code to the Agent Client Protocol (ACP). The project uses GitHub Actions for automated releases: pushing a &lt;code&gt;v*&lt;/code&gt; tag triggers automatic publishing to npm and creates a GitHub Release.&lt;/p&gt;

&lt;p&gt;This workflow had been running smoothly—until late December 2025...&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Starting with v0.3.1, every publish attempt failed. The GitHub Actions logs showed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm error code ENEEDAUTH
npm error need auth This command requires you to be logged in to https://registry.npmjs.org/
npm error need auth You need to authorize this machine using `npm adduser`
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even more confusing was this warning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm notice Security Notice: Classic tokens have been revoked. 
Granular tokens are now limited to 90 days and require 2FA by default. 
Update your CI/CD workflows to avoid disruption. 
Learn more https://gh.io/all-npm-classic-tokens-revoked
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Root Cause Analysis
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The End of npm Classic Tokens
&lt;/h3&gt;

&lt;p&gt;After investigation, I discovered that &lt;strong&gt;npm permanently revoked all Classic Tokens on December 9, 2025&lt;/strong&gt;. According to the &lt;a href="https://github.blog/changelog/2025-12-09-npm-classic-tokens-revoked-session-based-auth-and-cli-token-management-now-available/" rel="noopener noreferrer"&gt;official GitHub announcement&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All existing npm classic tokens have been permanently revoked&lt;/li&gt;
&lt;li&gt;Classic tokens can no longer be created or restored&lt;/li&gt;
&lt;li&gt;New Granular tokens have a maximum validity of 90 days and require 2FA by default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means &lt;strong&gt;the traditional approach of storing a long-lived &lt;code&gt;NPM_TOKEN&lt;/code&gt; in GitHub Secrets no longer works as before&lt;/strong&gt;: a classic token is rejected outright, and a granular token has to be replaced at least every 90 days.&lt;/p&gt;
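
&lt;p&gt;For reference, a token-based publish step typically looks like the sketch below. My original workflow is not reproduced in this post, so treat the step as an assumption based on the common &lt;code&gt;actions/setup-node&lt;/code&gt; pattern, with the secret named &lt;code&gt;NPM_TOKEN&lt;/code&gt; as mentioned above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Sketch of a token-based publish step (assumed shape, not the exact original)
- name: Publish to npm
  run: npm publish --access public
  env:
    # setup-node writes an .npmrc that reads the token from this variable
    NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a revoked classic token in that secret, a step of this shape is what produces the &lt;code&gt;ENEEDAUTH&lt;/code&gt; error shown earlier.&lt;/p&gt;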

&lt;h3&gt;
  
  
  The New Authentication Method: OIDC Trusted Publishing
&lt;/h3&gt;

&lt;p&gt;npm's recommended solution is &lt;strong&gt;OIDC Trusted Publishing&lt;/strong&gt;. This OpenID Connect-based authentication mechanism offers several advantages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No token management&lt;/strong&gt; – No need to create, store, or rotate tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhanced security&lt;/strong&gt; – Uses short-lived, cryptographically signed, workflow-specific credentials&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic provenance&lt;/strong&gt; – Automatically generates provenance statements, providing build-origin transparency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Industry standard&lt;/strong&gt; – Aligns with PyPI, RubyGems, crates.io, and other major package registries&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Troubleshooting Log
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Attempt 1: Upgrading npm Version
&lt;/h3&gt;

&lt;p&gt;Initially, I assumed the issue was an outdated npm version, so I added this to the workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Update npm to latest&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm install -g npm@latest&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result: Failed&lt;/strong&gt; ❌&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt 2: Removing registry-url
&lt;/h3&gt;

&lt;p&gt;Someone suggested removing the &lt;code&gt;registry-url&lt;/code&gt; parameter from &lt;code&gt;actions/setup-node&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;22'&lt;/span&gt;
    &lt;span class="c1"&gt;# Removed registry-url&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result: Failed&lt;/strong&gt; ❌&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt 3: Setting NODE_AUTH_TOKEN to Empty String
&lt;/h3&gt;

&lt;p&gt;Based on some outdated resources, I tried setting &lt;code&gt;NODE_AUTH_TOKEN&lt;/code&gt; to an empty string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Publish to npm&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish --access public&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;NODE_AUTH_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result: Failed&lt;/strong&gt; ❌&lt;/p&gt;

&lt;p&gt;Here's the critical misconception: setting an empty &lt;code&gt;NODE_AUTH_TOKEN&lt;/code&gt; actually &lt;strong&gt;prevents&lt;/strong&gt; OIDC from working, because npm attempts to use the empty token instead of OIDC.&lt;/p&gt;

&lt;h3&gt;
  
  
  Attempt 4: Completely Removing NODE_AUTH_TOKEN
&lt;/h3&gt;

&lt;p&gt;I finally realized that for OIDC Trusted Publishing, &lt;strong&gt;&lt;code&gt;NODE_AUTH_TOKEN&lt;/code&gt; should not be set at all&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Publish to npm&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish --access public&lt;/span&gt;
  &lt;span class="c1"&gt;# Note: no env section&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result: Partial success&lt;/strong&gt; ⚠️&lt;/p&gt;

&lt;p&gt;This time OIDC authentication started working (logs showed &lt;code&gt;Signed provenance statement&lt;/code&gt;), but a new error appeared:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npm error 422 Unprocessable Entity - PUT https://registry.npmjs.org/amp-acp - 
Error verifying sigstore provenance bundle: Failed to validate repository information: 
package.json: "repository.url" is "", expected to match 
"https://github.com/tao12345666333/amp-acp" from provenance
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Attempt 5 (Final Success): Adding the repository Field
&lt;/h3&gt;

&lt;p&gt;It turns out npm's provenance validation requires &lt;code&gt;package.json&lt;/code&gt; to include a &lt;code&gt;repository&lt;/code&gt; field whose URL matches the GitHub repository recorded in the provenance statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"amp-acp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.3.7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://github.com/tao12345666333/amp-acp"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result: Success!&lt;/strong&gt; ✅&lt;/p&gt;

&lt;h2&gt;
  
  
  The Correct Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Configure Trusted Publisher on npmjs.com
&lt;/h3&gt;

&lt;p&gt;First, configure Trusted Publisher on the npm website:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to &lt;code&gt;https://www.npmjs.com/package/YOUR_PACKAGE/settings&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Find the "Trusted Publisher" section&lt;/li&gt;
&lt;li&gt;Select "GitHub Actions"&lt;/li&gt;
&lt;li&gt;Fill in the following:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Organization/User&lt;/strong&gt;: Your GitHub username or organization name&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repository&lt;/strong&gt;: Your repository name&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflow filename&lt;/strong&gt;: The workflow file name (e.g., &lt;code&gt;release.yml&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment&lt;/strong&gt;: (Optional) If using GitHub Environments&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  2. GitHub Actions Workflow Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Release&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;v*'&lt;/span&gt;

&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;   &lt;span class="c1"&gt;# Required for OIDC authentication&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;   &lt;span class="c1"&gt;# Required for creating GitHub Release&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;release&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Checkout&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Setup Node.js&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;22'&lt;/span&gt;
          &lt;span class="na"&gt;registry-url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://registry.npmjs.org'&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Update npm to latest&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm install -g npm@latest&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Install dependencies&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Publish to npm&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm publish --access public&lt;/span&gt;
        &lt;span class="c1"&gt;# Note: Do NOT set NODE_AUTH_TOKEN!&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Create GitHub Release&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;softprops/action-gh-release@v2&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;generate_release_notes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Required package.json Fields
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-package-name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"x.y.z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"repository"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"git"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://github.com/YOUR_USERNAME/YOUR_REPO"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;npm Classic Tokens are dead&lt;/strong&gt; – As of December 9, 2025, all classic tokens are permanently invalidated&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OIDC Trusted Publishing is the new standard&lt;/strong&gt; – No token management, enhanced security, built-in provenance&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Do not set NODE_AUTH_TOKEN&lt;/strong&gt; – For OIDC, this environment variable should not be set at all&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Configure Trusted Publisher on npmjs.com&lt;/strong&gt; – This step is often overlooked&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;package.json must include the repository field&lt;/strong&gt; – Required for provenance validation&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ensure id-token: write permission&lt;/strong&gt; – Otherwise, OIDC token generation will fail&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;npm CLI version requirement&lt;/strong&gt; – Requires npm 11.5.1 or later&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
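
&lt;p&gt;To check the last point in CI, a small step can print the npm version that will run the publish. This is a minimal sketch; the step name is my own choice, and it is meant to sit right after the "Update npm to latest" step.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Hypothetical helper step: prints the npm version to the job log
# so it can be compared against the 11.5.1 minimum
- name: Show npm version
  run: npm --version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;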

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Q: Can I use OIDC to publish the first version of a new package?
&lt;/h3&gt;

&lt;p&gt;A: No. Trusted Publisher is configured in the settings of an existing package, so the first version must be published manually or with a granular access token. Trusted Publisher can only be configured afterward.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Can I use OIDC with self-hosted runners?
&lt;/h3&gt;

&lt;p&gt;A: Currently, only GitHub/GitLab-hosted runners are supported. Self-hosted runners are not yet supported.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: Why doesn't setting NODE_AUTH_TOKEN to an empty string work?
&lt;/h3&gt;

&lt;p&gt;A: An empty string is still a value—npm will attempt to use it rather than falling back to OIDC. Only when this variable is completely unset will npm automatically use OIDC.&lt;/p&gt;

&lt;h3&gt;
  
  
  Q: What should I do if provenance validation fails?
&lt;/h3&gt;

&lt;p&gt;A: Verify that &lt;code&gt;repository.url&lt;/code&gt; in &lt;code&gt;package.json&lt;/code&gt; exactly matches the GitHub repository URL, including letter case.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.npmjs.com/trusted-publishers" rel="noopener noreferrer"&gt;npm Trusted Publishing Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.blog/changelog/2025-12-09-npm-classic-tokens-revoked-session-based-auth-and-cli-token-management-now-available/" rel="noopener noreferrer"&gt;GitHub Changelog: npm classic tokens revoked&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.npmjs.com/generating-provenance-statements" rel="noopener noreferrer"&gt;npm Provenance Introduction&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Written on January 4, 2026, based on the experience of publishing the amp-acp project from v0.3.1 to v0.3.7.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>npm</category>
      <category>githubactions</category>
      <category>cicd</category>
      <category>security</category>
    </item>
    <item>
      <title>Mastering Kubernetes Services and Ingress: A Comprehensive Guide to Deploying Applications with Ease</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Sun, 12 Feb 2023 00:24:03 +0000</pubDate>
      <link>https://dev.to/zhangjintao/mastering-kubernetes-services-and-ingress-a-comprehensive-guide-to-deploying-applications-with-ease-4ai1</link>
      <guid>https://dev.to/zhangjintao/mastering-kubernetes-services-and-ingress-a-comprehensive-guide-to-deploying-applications-with-ease-4ai1</guid>
      <description>&lt;h2&gt;
  
  
  I. Introduction
&lt;/h2&gt;

&lt;p&gt;Kubernetes is an open-source platform for automating the deployment, scaling, and management of containerized applications. It has become the industry standard for container orchestration and is widely used in production environments.&lt;/p&gt;

&lt;p&gt;In a Kubernetes cluster, there are many components that work together to make the deployment of applications seamless. Services and Ingress are two of these components that are essential for making applications accessible from the outside world. In this article, we'll explore the details of Kubernetes Services and Ingress, and demonstrate how to use them effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  II. Kubernetes Services
&lt;/h2&gt;

&lt;h3&gt;
  
  
  A. Overview of Kubernetes Services
&lt;/h3&gt;

&lt;p&gt;In Kubernetes, a Service is a resource that represents a set of pods running the same application. It provides stable network endpoints for pods, making it easier to route network traffic to them. The Service abstraction helps to hide the complexities of network topology and ensures that network traffic reaches the right pods, even if they move around the cluster.&lt;/p&gt;

&lt;h3&gt;
  
  
  B. Types of Services in Kubernetes
&lt;/h3&gt;

&lt;p&gt;Kubernetes offers several types of Services, including ClusterIP, NodePort, LoadBalancer, and ExternalName. ClusterIP is the default and most commonly used type; it provides a cluster-internal IP address that routes traffic to pods. The NodePort type exposes the Service on a static port on each node's IP address, while LoadBalancer provisions a cloud-provider load balancer to route traffic to pods. The ExternalName type maps a Service to an external DNS name by returning a CNAME record, so workloads inside the cluster can reach an external service through a stable in-cluster name.&lt;/p&gt;
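
&lt;p&gt;As an illustration of one of these types, here is a minimal sketch of a NodePort Service. The names, the &lt;code&gt;app: web&lt;/code&gt; label, and the port numbers are assumptions chosen for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: web-nodeport        # hypothetical name
spec:
  type: NodePort            # omit this line to get the default ClusterIP type
  selector:
    app: web                # assumes the pods carry the label app=web
  ports:
    - port: 80              # port of the Service inside the cluster
      targetPort: 8080      # port the container listens on (assumed)
      nodePort: 30080       # must fall in the default 30000-32767 range
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;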

&lt;h3&gt;
  
  
  C. Service Discovery in Kubernetes
&lt;/h3&gt;

&lt;p&gt;Service discovery is the process of finding the network endpoint of a Service. In Kubernetes, Services can be discovered through the cluster DNS add-on (such as CoreDNS or kube-dns) or through environment variables that the kubelet injects into each pod for the Services that existed when the pod started. The DNS name of a Service has the format &lt;code&gt;&amp;lt;service-name&amp;gt;.&amp;lt;namespace&amp;gt;.svc.cluster.local&lt;/code&gt;.&lt;/p&gt;
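
&lt;p&gt;One way to see DNS-based discovery in action is a throwaway pod that resolves a Service name. This is only a sketch: it assumes a Service called &lt;code&gt;web&lt;/code&gt; exists in the &lt;code&gt;default&lt;/code&gt; namespace and that the cluster uses the &lt;code&gt;cluster.local&lt;/code&gt; domain.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: dns-check           # hypothetical name
  namespace: default
spec:
  restartPolicy: Never      # run the lookup once and stop
  containers:
    - name: lookup
      image: busybox:1.36
      # Resolve the Service by its fully qualified DNS name
      command: ["nslookup", "web.default.svc.cluster.local"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After the pod completes, &lt;code&gt;kubectl logs dns-check&lt;/code&gt; should show the lookup result.&lt;/p&gt;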

&lt;h3&gt;
  
  
  D. Creating and Managing Services in Kubernetes
&lt;/h3&gt;

&lt;p&gt;To create a Service in Kubernetes, you can use a YAML file that defines the Service resource, or use the &lt;code&gt;kubectl&lt;/code&gt; command-line tool. Once a Service is created, it can be inspected, modified, or deleted with &lt;code&gt;kubectl&lt;/code&gt;. A Service itself is not scaled; you scale the workload behind it (for example, a Deployment), and the Service picks up the new pods through its label selector.&lt;/p&gt;
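
&lt;p&gt;A minimal Service manifest might look like the following sketch. The Service name, the &lt;code&gt;app: web&lt;/code&gt; selector, and the ports are assumptions for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: web                 # hypothetical name
spec:
  selector:
    app: web                # traffic goes to pods labeled app=web
  ports:
    - protocol: TCP
      port: 80              # port exposed by the Service
      targetPort: 8080      # port the container listens on (assumed)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Saved as &lt;code&gt;service.yaml&lt;/code&gt;, it can be created with &lt;code&gt;kubectl apply -f service.yaml&lt;/code&gt; and inspected with &lt;code&gt;kubectl get service web&lt;/code&gt;.&lt;/p&gt;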

&lt;h2&gt;
  
  
  III. Kubernetes Ingress
&lt;/h2&gt;

&lt;h3&gt;
  
  
  A. Explanation of Ingress in Kubernetes
&lt;/h3&gt;

&lt;p&gt;Ingress is a Kubernetes resource that allows inbound HTTP and HTTPS traffic to reach Services in the cluster. It provides a single entry point for external traffic, which is then routed to the appropriate Service based on the URL path or hostname. This enables you to expose multiple Services on a single IP address, making it easier to manage and secure external access to your applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  B. Ingress Controllers in Kubernetes
&lt;/h3&gt;

&lt;p&gt;An Ingress Controller is a component in the cluster that is responsible for implementing the rules defined in Ingress resources. Kubernetes does not ship with one, so an Ingress resource has no effect until a controller is installed. There are several popular Ingress Controllers available, including the NGINX Ingress Controller, Traefik, and Istio. The choice of Ingress Controller depends on the needs of your deployment, such as performance, security, and extensibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  C. Ingress Rules and Path Routing
&lt;/h3&gt;

&lt;p&gt;Ingress resources define rules that determine how incoming traffic is routed to Services. Each rule matches on a hostname and one or more URL paths and names the backend Service and port that should receive the traffic. TLS is configured in the same resource, while features such as authentication are usually controller-specific and set through annotations. Ingress rules are defined in the YAML file for the Ingress resource, and they can be updated at any time.&lt;/p&gt;
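
&lt;p&gt;A sketch of host and path routing is shown below. The hostname, the Service names &lt;code&gt;api&lt;/code&gt; and &lt;code&gt;web&lt;/code&gt;, and the &lt;code&gt;nginx&lt;/code&gt; ingress class are assumptions; use whatever class your installed controller registers.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress         # hypothetical name
spec:
  ingressClassName: nginx   # assumed controller class
  rules:
    - host: app.example.com
      http:
        paths:
          # Requests under /api go to the api Service
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
          # Everything else goes to the web Service
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;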

&lt;h3&gt;
  
  
  D. Setting up SSL/TLS Encryption with Ingress
&lt;/h3&gt;

&lt;p&gt;Enabling SSL/TLS encryption for your applications is a best practice for security and privacy. With Ingress, you set this up by storing the certificate and private key in a TLS Secret and referencing it from the &lt;code&gt;tls&lt;/code&gt; section of the Ingress resource; the Ingress Controller then terminates SSL/TLS connections and forwards the traffic to the appropriate Service. Controller-specific behavior, such as forcing HTTPS redirects, is configured through annotations on the Ingress resource or in the Ingress Controller itself.&lt;/p&gt;
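
&lt;p&gt;A minimal sketch of TLS termination on an Ingress follows. It assumes a Secret of type &lt;code&gt;kubernetes.io/tls&lt;/code&gt; named &lt;code&gt;app-example-tls&lt;/code&gt; already exists in the same namespace; the hostname, Service name, and ingress class are also placeholders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress-tls     # hypothetical name
spec:
  ingressClassName: nginx   # assumed controller class
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-tls   # assumed existing TLS Secret
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;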

&lt;h3&gt;
  
  
  E. Creating and Managing Ingress Resources in Kubernetes
&lt;/h3&gt;

&lt;p&gt;Just like Services, Ingress resources can be created and managed using YAML files or the &lt;code&gt;kubectl&lt;/code&gt; command-line tool. Once an Ingress resource is created, the Ingress Controller in the cluster will automatically implement the rules defined in the resource. You can also update or delete the Ingress resource at any time to change the behavior of the Ingress Controller.&lt;/p&gt;

&lt;h2&gt;
  
  
  IV. Best Practices for Kubernetes Services and Ingress
&lt;/h2&gt;

&lt;h3&gt;
  
  
  A. Designing Scalable and Efficient Services and Ingress
&lt;/h3&gt;

&lt;p&gt;When designing your Services and Ingress, it's important to consider scalability and efficiency. This includes things like selecting the appropriate Service type, optimizing network traffic routing, and choosing the right Ingress Controller. By following best practices, you can ensure that your applications remain performant as they grow in size and complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  B. Securing Services and Ingress with Network Policies
&lt;/h3&gt;

&lt;p&gt;Securing your Services and Ingress is critical for protecting your applications and data. Kubernetes provides Network Policies, which select pods by label and control which traffic may reach or leave them. Network Policies are enforced by the cluster's network plugin, so they only take effect when that plugin supports them. By using Network Policies, you can restrict incoming and outgoing network traffic and ensure that only trusted sources can access your applications.&lt;/p&gt;
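
&lt;p&gt;As a sketch, the policy below lets only pods from an ingress controller namespace reach the application pods. The &lt;code&gt;app: web&lt;/code&gt; label, port 8080, and the &lt;code&gt;ingress-nginx&lt;/code&gt; namespace name are assumptions for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-web    # hypothetical name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web                  # the pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Allow traffic only from the namespace running the ingress controller
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Once a pod is selected by an Ingress policy, any inbound traffic not explicitly allowed is dropped.&lt;/p&gt;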

&lt;h3&gt;
  
  
  C. Monitoring and Logging Services and Ingress
&lt;/h3&gt;

&lt;p&gt;Monitoring and logging are essential for understanding the behavior of your applications and troubleshooting issues. The Kubernetes ecosystem offers several tools for monitoring and logging, including the Kubernetes Dashboard, Prometheus, and the ELK Stack. By setting up monitoring and logging, you can quickly detect and resolve problems with your Services and Ingress.&lt;/p&gt;

&lt;h2&gt;
  
  
  V. Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we've covered the basics of Kubernetes Services and Ingress and shown you how to use them to deploy and manage your applications. Services provide stable network endpoints for pods, while Ingress provides a single entry point for external traffic. By using these resources together, you can build scalable, efficient, and secure applications in Kubernetes.&lt;/p&gt;

&lt;h2&gt;
  
  
  VI. References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes official documentation: &lt;strong&gt;&lt;a href="https://kubernetes.io/docs/" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Kubernetes Services: &lt;strong&gt;&lt;a href="https://kubernetes.io/docs/concepts/services-networking/service/" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/concepts/services-networking/service/&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Kubernetes Ingress: &lt;strong&gt;&lt;a href="https://kubernetes.io/docs/concepts/services-networking/ingress/" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/concepts/services-networking/ingress/&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Kubernetes Network Policies: &lt;strong&gt;&lt;a href="https://kubernetes.io/docs/concepts/services-networking/network-policies/" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/concepts/services-networking/network-policies/&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Kubernetes Monitoring and Logging: &lt;strong&gt;&lt;a href="https://kubernetes.io/docs/tasks/debug-application-cluster/logging-monitoring/" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/tasks/debug-application-cluster/logging-monitoring/&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>discuss</category>
      <category>coding</category>
      <category>programmers</category>
    </item>
    <item>
      <title>20 tips for Prometheus Monitoring</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Wed, 01 Feb 2023 10:37:10 +0000</pubDate>
      <link>https://dev.to/zhangjintao/20-tips-for-prometheus-monitoring-3i21</link>
      <guid>https://dev.to/zhangjintao/20-tips-for-prometheus-monitoring-3i21</guid>
      <description>&lt;p&gt;Prometheus is an open-source monitoring system and time series database that is widely used for monitoring and alerting. It's a powerful tool that can provide deep insights into the performance of your infrastructure and applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In this article, I'll provide you with 20 tips for mastering Prometheus.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz9d3qois5esv6gnhlgk9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz9d3qois5esv6gnhlgk9.png" alt="use Prometheus monitoring" width="800" height="628"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Choose the Right Data Sources&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the first things to consider when setting up Prometheus is to ensure that you are collecting the right metrics from the right sources. The metrics you collect should be relevant to your use case and provide meaningful insights into the performance of your systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Labels Effectively&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Labels are a powerful tool for organizing and grouping metrics in Prometheus. It's important to use them wisely to ensure that your metrics are easily searchable and queryable. Labels allow you to segment your metrics based on attributes such as environment, application, and host.&lt;/p&gt;
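
&lt;p&gt;For example, labels can be attached to targets directly in the scrape configuration. In this sketch the job name, target address, and label values are all made up.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Fragment of prometheus.yml (hypothetical job and target)
scrape_configs:
  - job_name: api
    static_configs:
      - targets: ["api-1.example.internal:9100"]
        labels:
          # Added to every series scraped from the targets above
          environment: production
          application: checkout
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A query such as &lt;code&gt;up{environment="production"}&lt;/code&gt; can then select just those targets. Keep label values low in cardinality, since every distinct combination creates a new time series.&lt;/p&gt;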

&lt;p&gt;&lt;strong&gt;- Utilize Built-In Functionalities&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus provides a range of built-in functionality for complex queries and visualizations. PromQL is a powerful query language for selecting and aggregating metrics, and the built-in expression browser lets you run and graph ad hoc queries. (PromDash, the project's earlier dashboard builder, has been deprecated in favor of Grafana.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Service Discovery&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus provides service discovery, which allows you to automatically discover and scrape metrics from targets. This feature can save you time and effort by eliminating the need to manually configure targets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Alerts Wisely&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Alerts are an important feature of Prometheus, but be selective when setting them up. Ensure that every alert is actionable and meaningful, and tune thresholds so that alerts neither fire on false positives nor miss real problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Keep Your Data Fresh&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus relies on up-to-date metrics to provide meaningful insights into the performance of your systems. Ensure that your metrics are being updated frequently and that they are not stale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Store Your Data Effectively&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus stores your metrics in a local time series database on disk by default. Local storage is not replicated, so if you need a highly available and scalable backend, consider remote write to a third-party database or a cloud-based storage solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Optimize Your Queries&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PromQL is a powerful query language, but it's important to optimize your queries to improve performance and reduce load on your servers. Ensure that your queries are efficient and well-optimized.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Monitor Your Monitoring&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It's important to keep an eye on your Prometheus instances and their performance to ensure they are running smoothly. Regularly monitor your Prometheus instances to ensure they are functioning as expected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Pushgateway for Short-Lived Jobs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pushgateway is a component of Prometheus that is designed to handle metrics from short-lived jobs, such as batch jobs and cron jobs. If you have short-lived jobs in your environment, consider using Pushgateway to collect and store their metrics.&lt;/p&gt;
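&lt;p&gt;As a minimal sketch of what a push from a batch job can look like, the snippet below builds (but does not send) an HTTP request for Pushgateway's &lt;code&gt;/metrics/job/...&lt;/code&gt; endpoint using only the Python standard library. The gateway address, job name, and metric name are hypothetical placeholders, not values from a real setup.&lt;/p&gt;

```python
import urllib.request


def build_push_request(gateway_url, job, metrics):
    """Build (but do not send) a Pushgateway request for one job.

    metrics maps metric names to numeric values; the names used below
    are illustrative only.
    """
    # Text exposition format: one "name value" pair per line, newline-terminated.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        url=f"{gateway_url.rstrip('/')}/metrics/job/{job}",
        data=body.encode("utf-8"),
        method="PUT",  # PUT replaces the metrics previously pushed for this job
    )


request = build_push_request(
    "http://pushgateway.example.org:9091",  # hypothetical gateway address
    "nightly_backup",                       # hypothetical job name
    {"backup_last_success_timestamp_seconds": 1675247830},
)
# To actually push, a job would call: urllib.request.urlopen(request, timeout=5)
print(request.get_method(), request.full_url)
```

&lt;p&gt;Prometheus then scrapes the Pushgateway like any other target, so the job's last pushed values stay available after the job itself has exited.&lt;/p&gt;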

&lt;p&gt;&lt;strong&gt;- Use Grafana for Visualization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Grafana is a popular open-source dashboard solution that works well with Prometheus. It provides a range of visualization options and is easy to use. If you need to visualize your metrics, consider using Grafana.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Remote Write and Remote Read&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Remote Write and Remote Read are Prometheus features for integrating with remote storage: remote write sends scraped samples to a remote endpoint, and remote read lets Prometheus query samples back from it. If you need long-term storage or high availability for your metrics, consider using these features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Recording Rules&lt;/strong&gt;&lt;/p&gt;

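&lt;p&gt;Recording rules let you precompute frequently used or expensive expressions, such as aggregations, and save the results as new time series. Querying a precomputed series is usually much faster than evaluating the original expression every time, which improves performance and reduces the load on your servers.&lt;/p&gt;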

&lt;p&gt;&lt;strong&gt;- Monitor Your Application and Infrastructure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus is designed to monitor both your applications and infrastructure, so ensure that you are monitoring both to gain a complete picture of your systems. This can include metrics such as resource usage, network traffic, and application-specific metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Exporters for Non-Prometheus Systems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus works best with systems that have a Prometheus exporter, which can export metrics from non-Prometheus systems into Prometheus. Consider using exporters to integrate your existing systems into your Prometheus monitoring solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Manage Data Retention&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus provides options for managing data retention, such as a time-based limit (&lt;code&gt;--storage.tsdb.retention.time&lt;/code&gt;) and a size-based limit (&lt;code&gt;--storage.tsdb.retention.size&lt;/code&gt;). Ensure that you have appropriate settings in place, as retaining too much data can consume disk space and negatively impact performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Use Alertmanager for Alerting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Alertmanager is the Prometheus component that handles alerts after they fire, providing grouping, routing, silencing, and inhibition. Prometheus itself only evaluates alerting rules and sends the resulting alerts to Alertmanager, which then delivers notifications to your receivers. Use Alertmanager to manage your alerts so that notification handling stays flexible and scalable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Monitor Your Exporters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you are using exporters to integrate non-Prometheus systems into Prometheus, it's important to monitor the health of your exporters. Ensure that your exporters are running smoothly and that they are providing up-to-date metrics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Consider a Scalable Monitoring Solution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prometheus is designed to be scalable, but it can become challenging to manage as your environment grows. Consider using a scalable monitoring solution, such as Thanos or Cortex, to provide a more scalable and flexible solution for your monitoring needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;- Regularly Review Your Metrics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Regularly review your metrics to ensure that they are providing meaningful insights into the performance of your systems. Ensure that your metrics are relevant, up-to-date, and well-organized, and that they are providing the information you need to make informed decisions.&lt;/p&gt;

&lt;p&gt;In conclusion, Prometheus is a powerful and flexible monitoring solution that provides deep insights into the performance of your systems. &lt;/p&gt;

&lt;p&gt;By following these tips, you can master Prometheus and make the most of its capabilities.&lt;/p&gt;

&lt;p&gt;If you are interested in my articles, please subscribe to my Newsletter!&lt;/p&gt;


&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
      &lt;div class="c-embed__cover"&gt;
        &lt;a href="https://blog.moelove.info/newsletter" class="c-link s:max-w-50 align-middle" rel="noopener noreferrer"&gt;
          &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.moelove.info%2Fapi%2Fog%2Fhome%3Fog%3DeyJ0aXRsZSI6IkNsb3VkQ3JhZnRBSSUyMHdpdGglMjBKaW50YW8iLCJkb21haW4iOiJibG9nLm1vZWxvdmUuaW5mbyIsImlzVGVhbSI6dHJ1ZSwibWV0YSI6Ikt1YmVybmV0ZXMlMkMlMjBEb2NrZXIlMkMlMjBjb250YWluZXIlMkMlMjBlQlBGIiwiYXJ0aWNsZXMiOnsidG90YWxEb2N1bWVudHMiOjR9fQ%3D%3D" height="630" class="m-0" width="1200"&gt;
        &lt;/a&gt;
      &lt;/div&gt;
    &lt;div class="c-embed__body"&gt;
      &lt;h2 class="fs-xl lh-tight"&gt;
        &lt;a href="https://blog.moelove.info/newsletter" rel="noopener noreferrer" class="c-link"&gt;
          Newsletter | CloudCraftAI with Jintao
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;p class="truncate-at-3"&gt;
          Subscribe to CloudCraftAI with Jintao's newsletter.
        &lt;/p&gt;
      &lt;div class="color-secondary fs-s flex items-center"&gt;
          &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1611242173172%2FAOX1gE2jc.png" width="32" height="32"&gt;
        blog.moelove.info
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
      <category>cryptocurrency</category>
      <category>crypto</category>
      <category>blockchain</category>
      <category>web3</category>
    </item>
    <item>
      <title>How to reduce the cost of GitHub Actions</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Fri, 27 Jan 2023 02:08:55 +0000</pubDate>
      <link>https://dev.to/zhangjintao/how-to-reduce-the-cost-of-github-actions-2en5</link>
      <guid>https://dev.to/zhangjintao/how-to-reduce-the-cost-of-github-actions-2en5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I'll cover how to reduce the cost of GitHub Actions, and give some advice.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;According to G2's &lt;a href="https://www.g2.com/categories/continuous-integration?tab=easiest_to_use" rel="noopener noreferrer"&gt;statistical report&lt;/a&gt;, GitHub Actions is the easiest-to-use CI/CD tool, and more and more people like it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixgm1g2usgvad3gpp9ki.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixgm1g2usgvad3gpp9ki.png" alt="2023-01-24 08-12-54屏幕截图.png" width="800" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GitHub Actions is GitHub's native CI/CD tool, tens of thousands of Actions from the Marketplace can be used directly, and it is free for public repositories. As a result, more and more projects are switching their CI to GitHub Actions.&lt;/p&gt;

&lt;p&gt;I also really like GitHub Actions and use it for almost all my GitHub-hosted repositories.&lt;/p&gt;

&lt;p&gt;But recently I was working on a project that hit the GitHub Actions quota limit, which prompted me to spend some time looking into its cost.&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1616077513125691399-636" src="https://platform.twitter.com/embed/Tweet.html?id=1616077513125691399"&gt;
&lt;/iframe&gt;




&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is the quota exhausted?
&lt;/h2&gt;

&lt;p&gt;Recently I found an interesting project: &lt;a href="https://github.com/upptime/upptime" rel="noopener noreferrer"&gt;upptime/upptime: ⬆️ Free uptime monitor and status page powered by GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I wanted to use it to monitor some of the services I have developed and to build a status page. This involves some API configuration that I don't want to make public, so I forked the project into a private repository. After a simple configuration, it worked fine.&lt;/p&gt;

&lt;p&gt;Since I wanted more data, I tweaked the CI schedule configuration so that these tasks run more frequently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;workflowSchedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;graphs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;responseTime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;staticSite&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;updateTemplate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;updates&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
  &lt;span class="na"&gt;uptime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*/5&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;According to the &lt;a href="https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions" rel="noopener noreferrer"&gt;billing documentation for GitHub Actions&lt;/a&gt;, GitHub Actions is free for public repositories, but there is a quota limit for private repositories.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for Self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the product used with the account. Any usage beyond the included amounts is controlled by spending limits.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Soon I received a quota reminder email from GitHub, reminding me that the quota was about to be used up.&lt;/p&gt;

&lt;p&gt;This got me thinking about how to solve it.&lt;/p&gt;
&lt;h2&gt;
  
  
  Cost of using GitHub Actions
&lt;/h2&gt;

&lt;p&gt;Making the repository public is the most straightforward fix, but as I explained above, that is not an option here, so I had to look for other solutions.&lt;/p&gt;

&lt;p&gt;Paying for GitHub Actions is also a very straightforward solution.&lt;/p&gt;

&lt;p&gt;Before deciding to pay for it, I want to estimate the cost. GitHub provides a &lt;a href="https://github.com/pricing/calculator" rel="noopener noreferrer"&gt;Pricing Calculator&lt;/a&gt;, which can easily estimate costs.&lt;/p&gt;

&lt;p&gt;Since I modified the CI's scheduling configuration, the most frequently run tasks will run every 5 minutes.&lt;/p&gt;

&lt;p&gt;I used &lt;a href="https://meercode.io/" rel="noopener noreferrer"&gt;Meercode&lt;/a&gt; to collect the running data of GitHub Actions in this repository. It provides some dashboards by default:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy56nxw00i510mg2snh8w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy56nxw00i510mg2snh8w.png" alt="2023-01-25 11-29-12 screenshot.png" width="800" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It also lets users build custom dashboards, so I created my own. If you are interested in Meercode, please let me know in the comments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqne55oyayf4pkdv5pqkh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqne55oyayf4pkdv5pqkh.png" alt="ci-dashboard.png" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As can be seen from the figure above, each task takes no more than 0.5 minutes, and there are no more than 12 tasks per hour. Using the price calculator, the approximate &lt;strong&gt;cost is $35 per month&lt;/strong&gt;.&lt;/p&gt;
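&lt;p&gt;The estimate can be reproduced with a little arithmetic. The sketch below uses the upper bounds quoted above; the per-minute rate of $0.008 for a standard Linux runner and the 30-day month are assumptions, and the calculation ignores both the free minutes included in a plan and any rounding of per-job run time.&lt;/p&gt;

```python
# Inputs taken from the article; the per-minute rate is an assumption.
RUNS_PER_HOUR = 12          # "no more than 12 tasks per hour"
MINUTES_PER_RUN = 0.5       # "no more than 0.5 minutes" per task
HOURS_PER_MONTH = 24 * 30   # assume a 30-day month
RATE_PER_MINUTE = 0.008     # assumed USD rate for a standard Linux runner


def monthly_cost(runs_per_hour, minutes_per_run, rate_per_minute):
    """Return (billable minutes, cost in USD) for one month of runs."""
    minutes = runs_per_hour * minutes_per_run * HOURS_PER_MONTH
    return minutes, minutes * rate_per_minute


minutes, cost = monthly_cost(RUNS_PER_HOUR, MINUTES_PER_RUN, RATE_PER_MINUTE)
print(f"{minutes:.0f} billable minutes, about ${cost:.2f} per month")
```

&lt;p&gt;Under these assumptions the result is 4320 minutes and roughly $34.56, which is consistent with the figure of about $35 per month.&lt;/p&gt;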

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0idxcxl6cb10ltel6lej.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0idxcxl6cb10ltel6lej.png" alt="2023-01-25 11-25-13 screenshot.png" width="800" height="1128"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Ways to save costs
&lt;/h2&gt;

&lt;p&gt;My repository mainly runs uptime checks in CI, which consume few resources but run frequently, so I wondered whether a self-hosted runner could save costs.&lt;/p&gt;

&lt;p&gt;I compared the prices of 3 lower-priced cloud service providers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.civo.com/" rel="noopener noreferrer"&gt;Civo&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.digitalocean.com/pricing/droplets" rel="noopener noreferrer"&gt;DigitalOcean&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.vultr.com/pricing/" rel="noopener noreferrer"&gt;Vultr&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Among them, both Civo and Vultr provide 1C1G instances at $5/month, and DigitalOcean instances with the same specifications are priced at $6/month.&lt;/p&gt;

&lt;p&gt;I finally chose &lt;a href="https://www.civo.com/" rel="noopener noreferrer"&gt;Civo&lt;/a&gt;, which is a &lt;em&gt;cloud-native service provider&lt;/em&gt;, and there is an introduction on its homepage:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Transparent pricing from just $5 a month&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Civo provides a variety of services, such as Kubernetes (based on k3s), or compute instances.&lt;/p&gt;

&lt;p&gt;Among them, the &lt;em&gt;Extra Small&lt;/em&gt; instance type is 1C1G and includes 1TB of traffic. If you choose the Kubernetes service, you do not need to pay for the control plane (the same as Azure AKS). Even the larger specs look cheap.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo94vu12xbvfq8p1jtj8z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo94vu12xbvfq8p1jtj8z.png" alt="2023-01-25 17-04-35 screenshot.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I tried both its Kubernetes service and its compute instances, and both work fine.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosl81cpw3f1g2g0vipw8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fosl81cpw3f1g2g0vipw8.png" alt="2023-01-25 17-08-43 screenshot.png" width="800" height="333"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Using compute instances
&lt;/h3&gt;

&lt;p&gt;Deploying the GitHub Actions runner on a Linux compute instance is simple: open the new-runner page of your project at &lt;code&gt;https://github.com/&amp;lt;Your name&amp;gt;/&amp;lt;Project name&amp;gt;/settings/actions/runners/new&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That page lists the complete deployment steps, so you only need to follow them.&lt;/p&gt;

&lt;p&gt;My installation process is as follows:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;civo@polished-bush-99d8-1926a1:~&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;actions-runner &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;actions-runner
civo@polished-bush-99d8-1926a1:~/actions-runner&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;-o&lt;/span&gt; actions-runner-linux-x64-2.301.1.tar.gz &lt;span class="nt"&gt;-L&lt;/span&gt; https://github.com/actions/runner/releases/download/v2.301.1/actions-runner-linux-x64-2.301.1.tar.gz
civo@polished-bush-99d8-1926a1:~/actions-runner&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"3ee9c3b83de642f919912e0594ee2601835518827da785d034c1163f8efdf907  actions-runner-linux-x64-2.301.1.tar.gz"&lt;/span&gt; | shasum &lt;span class="nt"&gt;-a&lt;/span&gt; 256 &lt;span class="nt"&gt;-c&lt;/span&gt;
actions-runner-linux-x64-2.301.1.tar.gz: OK                                                                     
civo@polished-bush-99d8-1926a1:~/actions-runner&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;tar &lt;/span&gt;xzf ./actions-runner-linux-x64-2.301.1.tar.gz              
civo@polished-bush-99d8-1926a1:~/actions-runner&lt;span class="nv"&gt;$ &lt;/span&gt;./config.sh &lt;span class="nt"&gt;--url&lt;/span&gt; https://github.com/MoeLove/monitoring &lt;span class="nt"&gt;--token&lt;/span&gt; &lt;span class="nv"&gt;$TOKEN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;After the execution is complete, some files will be added to the current directory. Execute &lt;code&gt;./run.sh&lt;/code&gt; to start the GitHub Actions runner.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;civo@polished-bush-99d8-1926a1:~/actions-runner&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;ls
&lt;/span&gt;_diag  _work  actions-runner-linux-x64-2.301.1.tar.gz  bin  config.sh  env.sh  externals  run-helper.cmd.template  run-helper.sh  run-helper.sh.template  run.sh  safe_sleep.sh  svc.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;To keep the runner running reliably in the background, execute &lt;code&gt;sudo ./svc.sh install&lt;/code&gt; to install it as a systemd service, then &lt;code&gt;sudo ./svc.sh start&lt;/code&gt; to start it, and manage its life cycle through systemd.&lt;/p&gt;
&lt;h3&gt;
  
  
  Using Kubernetes
&lt;/h3&gt;

&lt;p&gt;Civo does not charge for the Kubernetes control plane, only for worker nodes. The advantage of using Kubernetes is that runners can scale up and down automatically within the cluster, and I can easily create and run multiple runners for different projects.&lt;/p&gt;

&lt;p&gt;Since GitHub does not officially provide a way to deploy a Self-hosted runner on Kubernetes, I used the &lt;a href="https://github.com/actions/actions-runner-controller" rel="noopener noreferrer"&gt;Actions Runner Controller (ARC)&lt;/a&gt; project. It allows rapid deployment of Self-hosted runners through custom resources such as &lt;code&gt;Runner&lt;/code&gt; and &lt;code&gt;RunnerDeployment&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The deployment process is clearly described in the &lt;a href="https://github.com/actions/actions-runner-controller/blob/master/docs/quickstart.md" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;. The following is my deployment process.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# deploy cert-manager&lt;/span&gt;
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; https://github.com/cert-manager/cert-manager/releases/download/v1.11.0/cert-manager.yaml

&lt;span class="c"&gt;# deploy ARC&lt;/span&gt;
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ helm upgrade &lt;span class="nt"&gt;--install&lt;/span&gt; &lt;span class="nt"&gt;--namespace&lt;/span&gt; actions-runner-system &lt;span class="nt"&gt;--create-namespace&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;authSecret.create&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;authSecret.github_token&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"REPLACE_YOUR_TOKEN_HERE"&lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--wait&lt;/span&gt; actions-runner-controller actions-runner-controller/actions-runner-controller

&lt;span class="c"&gt;# create runner&lt;/span&gt;
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ &lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt; | kubectl apply -f -
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: moelove-runner
spec:
  replicas: 1
  template:
    spec:
      repository: MoeLove/monitoring
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;After installation, the following results are achieved:&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1616251840429002755-27" src="https://platform.twitter.com/embed/Tweet.html?id=1616251840429002755"&gt;
&lt;/iframe&gt;




&lt;/p&gt;
&lt;h2&gt;
  
  
  Self-hosted vs GitHub-managed
&lt;/h2&gt;

&lt;p&gt;Above, I described how I used Meercode to measure key CI metrics and estimate the cost of GitHub Actions. Because my workload uses few resources but runs very frequently, I chose the Self-hosted runner.&lt;/p&gt;

&lt;p&gt;So when is it more appropriate to choose a GitHub-managed runner? What are the benefits of GitHub-managed?&lt;/p&gt;

&lt;p&gt;The GitHub-managed runner has the following advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Support for multiple operating systems&lt;/strong&gt;: In addition to providing Linux systems, GitHub-managed runner also supports macOS and Windows, but most cloud providers do not provide macOS environments. (I used to put some Mac minis as servers in the data center for specific scenarios)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;VM-level isolation&lt;/strong&gt;: According to the GitHub Actions documentation, a GitHub-managed runner executes each job in a freshly created VM, which brings certain security and isolation guarantees. A Self-hosted runner started from the binary runs jobs directly in the host environment, and a runner deployed through ARC is isolated only at the Pod level. Both provide weaker isolation than a dedicated VM, which can cause security issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Low Maintenance Costs&lt;/strong&gt;: In any large system, maintenance is expensive. If the Self-hosted runner is only for personal use, or only a few projects use it, the maintenance cost is manageable. Once the setup grows, it introduces a lot of complexity. The GitHub-managed runner, by contrast, is maintained by GitHub.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are also two products that offer self-hosted runner services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://actuated.dev?umt_source=blog.moelove.info" rel="noopener noreferrer"&gt;Actuated&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cirun.io?umt_source=blog.moelove.info" rel="noopener noreferrer"&gt;cirun&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They reduce the cost of runner maintenance and management and provide more secure isolation and support for Arm-based environments. cirun also provides GPU runner support.&lt;/p&gt;

&lt;p&gt;If you have the above requirements, you may also wish to consider these services.&lt;/p&gt;
&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In general, the following steps are required to reduce the cost of GitHub Actions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Visualization/Observability: Estimate costs using actual data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Compare multiple vendors/solutions: Different vendors offer different pricing for different scenarios or products, and you can choose according to your actual situation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Security and maintenance costs also need to be considered.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>authentication</category>
      <category>security</category>
      <category>monitoring</category>
      <category>discuss</category>
    </item>
    <item>
      <title>My Rust journey and how to learn Rust</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Tue, 17 Jan 2023 13:08:16 +0000</pubDate>
      <link>https://dev.to/zhangjintao/my-rust-journey-and-how-to-learn-rust-2obj</link>
      <guid>https://dev.to/zhangjintao/my-rust-journey-and-how-to-learn-rust-2obj</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I'll share my Rust journey, how I learned Rust and some free Rust learning resources.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Rust has become more and more popular. Through the &lt;a href="https://survey.stackoverflow.co/2022/#most-loved-dreaded-and-wanted-language-want" rel="noopener noreferrer"&gt;StackOverflow 2022 Developer Survey&lt;/a&gt;, we can see that many people are interested in Rust.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Rust is on its seventh year as the most loved language with 87% of developers saying they want to continue using it.&lt;/p&gt;

&lt;p&gt;Rust also ties with Python as the most wanted technology with TypeScript running a close second&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Most Wanted&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftiv2cf1cctuxjqfhd0qf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftiv2cf1cctuxjqfhd0qf.png" alt="2023-01-14 01-16-33 screenshot.png" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Most Loved vs. Dreaded&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76c8crgnws7jijmgrj69.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76c8crgnws7jijmgrj69.png" alt="2023-01-14 01-16-07 screenshot.png" width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But Rust has a steep learning curve.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://camo.githubusercontent.com/1d24e64022fd725f1896890b3ce14c560f075dc1f80f0b0baae3ece8981c882a/68747470733a2f2f70617065722d6174746163686d656e74732e64726f70626f782e636f6d2f735f353445314239364546464546443239343536323930324443354239393731443335434436423635304243383744313230303341333041343635313737363230315f313538363531343237353631385f696d6167652e706e67" class="article-body-image-wrapper"&gt;&lt;img src="https://camo.githubusercontent.com/1d24e64022fd725f1896890b3ce14c560f075dc1f80f0b0baae3ece8981c882a/68747470733a2f2f70617065722d6174746163686d656e74732e64726f70626f782e636f6d2f735f353445314239364546464546443239343536323930324443354239393731443335434436423635304243383744313230303341333041343635313737363230315f313538363531343237353631385f696d6167652e706e67" alt="pic from Rust User Team Samara - &amp;amp;Meetup1" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This made me want to share my Rust journey, why I chose Rust, and how to learn Rust.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting connected with Rust
&lt;/h2&gt;

&lt;p&gt;I had heard about Rust when it was first released, and my impression was that it was a systems programming language that could replace C/C++ and was safe enough. But I never really learned or used it. (I had only written Hello World in it!)&lt;/p&gt;

&lt;p&gt;Five years ago, I was leading the transformation of the company's infrastructure into a cloud-native stack.&lt;/p&gt;

&lt;p&gt;I needed to build a monitoring stack based entirely on Prometheus to replace the company's in-house monitoring software, which had more than 10 years of history, along with other monitoring software such as Nagios, Zabbix, and Graphite.&lt;/p&gt;

&lt;p&gt;Yes, you read that right, we were using a lot of monitoring software. There were a few reasons for this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;No single piece of software could meet all needs&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The teams were scattered, and most of the time new software was introduced just to meet one specific need rather than to solve the underlying problem&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, the reasons were historical.&lt;/p&gt;

&lt;p&gt;As mentioned above, our in-house monitoring software had been around for more than 10 years, which shows how slowly our infrastructure iterated.&lt;/p&gt;

&lt;p&gt;And because we ran our own physical data centers, many of our servers were old machines that had never been updated. (This is one of the reasons why I used Rust later.)&lt;/p&gt;

&lt;p&gt;I first replaced the monitoring stack in a newly launched small data center with about 400 machines, and the result was good. Prometheus monitored all the servers in this small data center and the various services running on them. I also created dashboards for them in Grafana and set up alert notifications through Alertmanager.&lt;/p&gt;

&lt;p&gt;Later, I rolled out the same changes in two more data centers, and overall it went relatively smoothly. Monitoring of Kubernetes was also completed during this process.&lt;/p&gt;

&lt;p&gt;But when I rolled it out in the last data center, I faced the biggest challenge.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/prometheus/node_exporter/" rel="noopener noreferrer"&gt;&lt;strong&gt;node_exporter&lt;/strong&gt;&lt;/a&gt; &lt;strong&gt;failed to start on some machines, and on others it crashed after running for some time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I started to investigate this issue. As a temporary fix for the crashes, I added a restart script.&lt;/p&gt;

&lt;p&gt;I was mainly concerned with why node_exporter would not start. I found that the affected machines were running CentOS 5 with kernel 2.6.18.&lt;/p&gt;

&lt;p&gt;I found that there are already similar issues in the community: &lt;a href="https://github.com/prometheus/node_exporter/issues/691" rel="noopener noreferrer"&gt;https://github.com/prometheus/node_exporter/issues/691&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At the same time, I also noticed that the Go documentation clearly stated that CentOS 5 is not supported, and a kernel of at least version 2.6.32 or above is required.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;(I no longer remember the exact minimum requirements from when I checked, but according to the &lt;a href="https://web.archive.org/web/20170916192117/https://github.com/golang/go/wiki/MinimumRequirements" rel="noopener noreferrer"&gt;web archive&lt;/a&gt;, the minimum kernel version required in 2017 was 2.6.23.)&lt;/p&gt;

&lt;p&gt;After some searching, I also saw something like &lt;a href="https://dave.cheney.net/2013/06/18/how-to-install-go-1-1-on-centos-5" rel="noopener noreferrer"&gt;How to install Go 1.1 on CentOS 5.9&lt;/a&gt;, but at the same time, some known issues are mentioned in the article.&lt;/p&gt;

&lt;p&gt;So I decided not to keep fighting it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I decided to re-implement one myself&lt;/strong&gt;, which would also solve the crash problem described above.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In the end, I used Rust to implement a tool similar to node_exporter and completed the upgrade and transformation of the monitoring system.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is where my journey started with Rust in production.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Next, let me introduce why I chose Rust.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why choose Rust
&lt;/h2&gt;

&lt;p&gt;I have introduced some background above. At that time, the easiest choice would have been Python, which is simple enough and has a rich ecosystem. I also had many years of experience in Python development, so I could have quickly built the tools I needed.&lt;/p&gt;

&lt;p&gt;The reasons for not choosing Python are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Not all of these machines have a Python environment, and the versions of Python are also different. I was asked not to modify the environment on these machines as much as possible;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Since I might need to make modifications later, I thought that redistributing updated versions would not be convenient;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then I rethought my goal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The tool must compile into a binary executable file for easy distribution and deployment. I used Ansible for unified deployment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So a more suitable option is C/C++/Rust.&lt;/p&gt;

&lt;p&gt;I have more experience in C development and a little experience in C++. All three languages could easily meet this requirement.&lt;/p&gt;

&lt;p&gt;When most people compare Rust and C/C++, they are comparing their performance and safety.&lt;/p&gt;

&lt;p&gt;In my use case at the time, I didn't think the results in the other two languages would be worse than in Rust, although performance and safety were still considerations. And since I was just starting to learn Rust, my Rust implementation might even have been worse than a C one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But I wanted more of a challenge and to try something new, and in terms of Prometheus monitoring, the C/C++ ecosystem was not very active. I also believed Rust would develop strongly in the future.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So in the end I chose Rust.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I learned Rust
&lt;/h2&gt;

&lt;p&gt;Rust is not simple, and it's not quite the same as other languages, so some practices that work in other languages may not work in Rust.&lt;/p&gt;

&lt;p&gt;I had a specific problem to solve: I needed to implement a &lt;a href="https://github.com/prometheus/node_exporter/" rel="noopener noreferrer"&gt;node_exporter&lt;/a&gt; replacement to complete the transformation of the monitoring stack. So I learned Rust by doing.&lt;/p&gt;

&lt;p&gt;I first took a quick look at the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://doc.rust-lang.org/stable/book/" rel="noopener noreferrer"&gt;The Rust Programming Language&lt;/a&gt;: This book is very complete, and I didn't read all of it at first. Instead, I used it to understand the main concepts and some common usage in Rust.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://doc.rust-lang.org/rust-by-example/" rel="noopener noreferrer"&gt;Rust By Example&lt;/a&gt;: There are many examples here, and you can also increase your familiarity with Rust by practicing these examples;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://doc.rust-lang.org/std/index.html" rel="noopener noreferrer"&gt;Rust std lib docs&lt;/a&gt;: Documentation of the standard library, a quick overview, understanding some keywords, modules, etc. But it is not necessary to read it in its entirety initially.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This way I quickly implemented a basic version of node_exporter. I then kept iterating, deployed it to the production environment, and completed the construction of the Prometheus monitoring stack.&lt;/p&gt;

&lt;p&gt;Later, I continued to implement some small tools in Rust, learned its best practices, and studied some open-source projects implemented in Rust to build up my Rust experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recommended Rust learning resources
&lt;/h2&gt;

&lt;p&gt;There are many learning resources for Rust now. In addition to the ones I listed above, I recommend the following free content:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://learn.microsoft.com/en-us/training/paths/rust-first-steps/" rel="noopener noreferrer"&gt;Take your first steps with Rust - Training | Microsoft Learn&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://github.com/rust-lang/rustlings" rel="noopener noreferrer"&gt;rust-lang/rustlings: Small exercises to get you used to reading and writing Rust code!&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=zF34dRivLOw&amp;amp;utm_source=blog.moelove.info&amp;amp;utm_medium=content" rel="noopener noreferrer"&gt;Rust Crash Course | Rustlang - YouTube&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=T_KrYLW4jw8&amp;amp;list=PLzMcBGfZo4-nyLTlSRBvo0zjSnCnqjHYQ&amp;amp;utm_source=blog.moelove.info&amp;amp;utm_medium=content" rel="noopener noreferrer"&gt;Rust Tutorial - YouTube&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.youtube.com/playlist?list=PLlrxD0HtieHjbTjrchBwOVks_sr8EVW1x&amp;amp;utm_source=blog.moelove.info&amp;amp;utm_medium=content" rel="noopener noreferrer"&gt;Rust for Beginners - YouTube&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;This is how my Rust journey started, and it continues.&lt;/p&gt;

&lt;p&gt;Although I focus on Cloud Native and Kubernetes-related technologies and now write more Go, I still write some tools in Rust and use Rust with WebAssembly.&lt;/p&gt;

&lt;p&gt;In the future, I will also share related content. If you are interested in my articles, please subscribe to my Newsletter!&lt;/p&gt;


&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
      &lt;div class="c-embed__cover"&gt;
        &lt;a href="https://blog.moelove.info/newsletter" class="c-link s:max-w-50 align-middle" rel="noopener noreferrer"&gt;
          &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fblog.moelove.info%2Fapi%2Fog%2Fhome%3Fog%3DeyJ0aXRsZSI6IkNsb3VkQ3JhZnRBSSUyMHdpdGglMjBKaW50YW8iLCJkb21haW4iOiJibG9nLm1vZWxvdmUuaW5mbyIsImlzVGVhbSI6dHJ1ZSwibWV0YSI6Ikt1YmVybmV0ZXMlMkMlMjBEb2NrZXIlMkMlMjBjb250YWluZXIlMkMlMjBlQlBGIiwiYXJ0aWNsZXMiOnsidG90YWxEb2N1bWVudHMiOjR9fQ%3D%3D" height="630" class="m-0" width="1200"&gt;
        &lt;/a&gt;
      &lt;/div&gt;
    &lt;div class="c-embed__body"&gt;
      &lt;h2 class="fs-xl lh-tight"&gt;
        &lt;a href="https://blog.moelove.info/newsletter" rel="noopener noreferrer" class="c-link"&gt;
          Newsletter | CloudCraftAI with Jintao
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;p class="truncate-at-3"&gt;
          Subscribe to CloudCraftAI with Jintao's newsletter.
        &lt;/p&gt;
      &lt;div class="color-secondary fs-s flex items-center"&gt;
          &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1611242173172%2FAOX1gE2jc.png" width="32" height="32"&gt;
        blog.moelove.info
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
      <category>github</category>
      <category>gitlab</category>
      <category>cli</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Opportunities and Challenges of Technological Evolution in Cloud Native</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Thu, 15 Dec 2022 17:06:05 +0000</pubDate>
      <link>https://dev.to/zhangjintao/opportunities-and-challenges-of-technological-evolution-in-cloud-native-454j</link>
      <guid>https://dev.to/zhangjintao/opportunities-and-challenges-of-technological-evolution-in-cloud-native-454j</guid>
      <description>&lt;p&gt;Nowadays, Cloud Native is becoming increasingly popular, and the CNCF defines Cloud Native as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Based on a modern and dynamic environment, aka cloud environment.&lt;/li&gt;
&lt;li&gt;With containerization as the fundamental technology, including Service Mesh, immutable infrastructure, declarative API, etc.&lt;/li&gt;
&lt;li&gt;Key features include autoscaling, manageability, observability, automation, frequent change, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to the CNCF 2021 survey, the Kubernetes community has over 62,000 contributors. Following this trend, more and more companies are investing in Cloud Native and moving early toward active cloud deployment. Why are companies embracing Cloud Native as they grow, and what does Cloud Native mean for them?&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Advantages of Cloud Native
&lt;/h2&gt;

&lt;p&gt;The popularity of Cloud Native comes from its advantages at the technical level. Cloud Native technology has two main aspects: containerization, led by Docker, and container orchestration, led by Kubernetes.&lt;/p&gt;

&lt;p&gt;Docker introduced container images to the technology world, making them a standardized delivery unit. Containerization technology already existed before Docker; a relatively recent example is LXC (&lt;a href="https://linuxcontainers.org/" rel="noopener noreferrer"&gt;Linux Containers&lt;/a&gt;), which appeared in 2008. LXC is less popular than Docker because Docker provides container images, which are more standardized and easier to migrate. Docker also created the Docker Hub public service, which has become the world's largest container image repository. In addition, containerization can achieve a certain degree of resource isolation, covering not only CPU, memory, and other resources but also the network stack, which makes it easier to deploy multiple copies of an application on the same machine.&lt;/p&gt;

&lt;p&gt;Kubernetes became popular due to the booming of Docker. The container orchestration technology, led by Kubernetes, provides several important capabilities, such as fault self-healing, resource scheduling, and service orchestration. Kubernetes has a built-in DNS-based service discovery mechanism, and thanks to its scheduling architecture, it can be scaled very quickly to achieve service orchestration.&lt;/p&gt;

&lt;p&gt;Now more and more companies are actively embracing Kubernetes and transforming their applications so they can be deployed on it. The Cloud Native we are talking about here takes Kubernetes as its premise, since Kubernetes is the cornerstone of Cloud Native technology.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyynaeviof9oo0dlho0cb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyynaeviof9oo0dlho0cb.png" alt="img1.PNG" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Containerization Advantages
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Standardized Delivery&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Container images have now become a standardized delivery unit. With containerization, users can deliver an application directly as a container image instead of as a binary or source code. Relying on the packaging mechanism of the container image, you can use the same image to start a service and get the same behavior in any container runtime.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Portable and Light-weight, Cost-saving&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Containerization achieves a degree of isolation through capabilities of the Linux kernel, which in turn makes applications easier to migrate. Moreover, containers run applications directly, which is lighter than virtualization because no guest OS is needed inside a virtual machine.&lt;br&gt;
 All applications can share the kernel, which saves cost. And the larger the application, the greater the cost savings.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Convenience of resource management&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When starting a container, you can set the CPU, memory, or disk IO properties that can be used for the container service, which allows for better planning and deployment of resources when starting application instances through containers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Container Orchestration Advantages
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Simplify the Workflow&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In Kubernetes, application deployment is easier to manage than in Docker, since Kubernetes uses declarative configuration. For example, a user can simply declare in a configuration file which container image the application uses and which service ports are exposed, without any additional management. Kubernetes then carries out the operations needed to reach the declared state, which greatly simplifies the workflow.&lt;/p&gt;
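
&lt;p&gt;To make the declarative idea more concrete, here is a minimal sketch of a reconcile step: it compares a declared state with the observed state and lists the actions needed to close the gap. This is only an illustration of the concept, not how Kubernetes is actually implemented; the &lt;code&gt;reconcile&lt;/code&gt; function, the application names, and the image tags are all hypothetical.&lt;/p&gt;

```python
def reconcile(desired, actual):
    """Return the actions needed to move `actual` toward `desired`.

    Both arguments map a (hypothetical) application name to an
    (image, replicas) tuple.
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, actual[name]))
    return actions


# Declared state (what the user wrote) and observed state (what is running).
desired = {"web": ("nginx:1.25", 3), "api": ("example/api:2.0", 2)}
actual = {"web": ("nginx:1.24", 3), "worker": ("example/worker:1.0", 1)}

for action in reconcile(desired, actual):
    print(action)
```

&lt;p&gt;The user only edits the declared state; working out which operations to run is left to the platform.&lt;/p&gt;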

&lt;ol start="2"&gt;
&lt;li&gt;Improve Efficiency and Save Costs&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Another advantage of Kubernetes is failover. When a node in Kubernetes crashes, Kubernetes automatically reschedules the applications on it to other healthy nodes and gets them up and running. The entire recovery process requires no human intervention, so it not only improves operation and maintenance efficiency but also saves time and cost.&lt;/p&gt;

&lt;p&gt;The rise of Docker and Kubernetes has brought great innovation and opportunity to application delivery. Container images, as standard delivery units, shorten the delivery process and make it easier to integrate with CI/CD systems.&lt;/p&gt;

&lt;p&gt;Considering that application delivery is becoming faster, how is application architecture following the Cloud Native trend?&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Architecture Evolution: from Monoliths and Microservices to Service Mesh
&lt;/h2&gt;

&lt;p&gt;The starting point of application architecture evolution is the monolithic architecture. As the size and requirements of applications increased, the monolithic architecture no longer met the needs of collaborative team development, so distributed architectures were gradually introduced.&lt;/p&gt;

&lt;p&gt;Among the distributed architectures, the most popular one is the microservice architecture. A microservice architecture splits services into multiple modules, which communicate with each other, perform service registration and discovery, and provide common capabilities such as rate limiting and circuit breaking.&lt;/p&gt;

&lt;p&gt;In addition, a microservice architecture includes various patterns. One example is the database-per-service pattern, in which each microservice has its own database. This pattern avoids database-level impact on the application but may introduce more database instances.&lt;/p&gt;

&lt;p&gt;Another one is the API Gateway pattern, which receives the entrance traffic of the cluster or the whole microservice architecture through a gateway and completes the traffic distribution through APIs. This is one of the most used patterns, and gateway products like Spring Cloud Gateway or Apache APISIX can be applied.&lt;/p&gt;

&lt;p&gt;The popular architectures are gradually extending to Cloud Native architectures. Can a microservice architecture under Cloud Native simply build the original microservice as a container image and migrate it directly to Kubernetes?&lt;/p&gt;

&lt;p&gt;In theory, it seems possible, but in practice there are some challenges. In a Cloud Native microservice architecture, it is not enough for these components to run in containers; other aspects such as service registration, discovery, and configuration also have to be handled.&lt;/p&gt;

&lt;p&gt;The migration process also involves business-level transformation and adaptation, requiring common logic such as authentication, authorization, and observability-related capabilities (logging, monitoring, etc.) to be migrated to K8s. Therefore, migrating from the original physical machine deployment to the K8s platform is much more complex than it appears.&lt;/p&gt;

&lt;p&gt;In this case, we can use the Sidecar model to abstract and simplify the above scenario.&lt;/p&gt;

&lt;p&gt;Typically, the Sidecar model takes the form of a Sidecar Proxy. The architecture evolves from the left side of the diagram below to the right side by moving generic capabilities (such as authentication, authorization, and security) down into the Sidecar. As the diagram shows, instead of maintaining multiple components, you only need to maintain two things (application + Sidecar). At the same time, the Sidecar itself contains the common components, so the business side does not need to maintain them, which makes microservice communication much easier to handle.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidm0dtugo8gwsbhm8tuz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidm0dtugo8gwsbhm8tuz.png" alt="img2.PNG" width="800" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To avoid configuring each Sidecar separately and reinventing the wheel when introducing a Sidecar for each microservice, a control plane can be introduced to manage the Sidecars or to inject them. This is how the current Service Mesh gradually took shape.&lt;/p&gt;

&lt;p&gt;Service Mesh usually requires two components: a control plane and a data plane. The control plane distributes configuration and executes the related logic; Istio is currently the most popular choice. On the data plane, you can choose an API gateway like Apache APISIX for traffic forwarding and service communication. Thanks to the high performance and scalability of APISIX, it can also handle customized requirements and custom logic. The following shows the architecture of the Service Mesh solution with Istio+APISIX.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58rfgjstcis5puo94ebd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58rfgjstcis5puo94ebd.png" alt="img3.PNG" width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The advantage of this solution is that when you want to migrate from the previous microservice architecture to a Cloud Native architecture, you can avoid massive changes on the business side by using a Service Mesh solution directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Challenges of Cloud Native
&lt;/h2&gt;

&lt;p&gt;The previous sections covered some of the technical advantages of the current Cloud Native trend. However, every coin has two sides. Although these technologies bring fresh elements and opportunities, they also introduce new challenges.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problems Caused by Containerization and K8s
&lt;/h3&gt;

&lt;p&gt;Earlier in the article, we mentioned that containerization technology uses a shared kernel. The shared kernel makes containers lightweight but weakens isolation. If a container escape occurs, the host itself may be attacked. To meet these security challenges, technologies such as secure containers have been introduced.&lt;/p&gt;

&lt;p&gt;In addition, although container images provide a standardized delivery method, they are also a target for attacks such as supply chain attacks.&lt;/p&gt;

&lt;p&gt;Similarly, the introduction of K8s has brought challenges in component security. More components mean a larger attack surface, as well as additional vulnerabilities in the underlying components and their dependencies. At the infrastructure level, migrating from traditional physical or virtual machines to K8s involves infrastructure transformation costs and more labor costs for cluster data backups, periodic upgrades, and certificate renewals.&lt;/p&gt;

&lt;p&gt;Also, in the Kubernetes architecture, the apiserver is the core component of the cluster and handles all traffic from inside and outside the cluster. Therefore, to avoid border security issues, protecting the apiserver becomes a key question. For example, we can use Apache APISIX to protect it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security
&lt;/h3&gt;

&lt;p&gt;The use of new technologies requires additional attention at the security level:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;At the network security level&lt;/strong&gt;, fine-grained control of traffic can be implemented with Network Policy, and connections can be encrypted with methods like mTLS to form a zero-trust network.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;At the data security level&lt;/strong&gt;, K8s provides the Secret resource for handling confidential data, but on its own it is not really secure. The contents of a Secret are only Base64-encoded, not encrypted, so anyone who can read them can recover the original value through Base64 decoding. This matters especially when they are stored in etcd, where they can be read directly by anyone with access to etcd.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;At the level of permission security&lt;/strong&gt;, RBAC settings are sometimes too permissive, which lets an attacker use the relevant Token to communicate with the apiserver and carry out an attack. This kind of permission setting is mostly seen in controller and operator scenarios.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
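
&lt;p&gt;As a quick illustration of why Base64 offers no confidentiality, the sketch below encodes a made-up password the way it might appear in a Secret manifest and then decodes it again without any key. The value is purely hypothetical; only Python's standard &lt;code&gt;base64&lt;/code&gt; module is used.&lt;/p&gt;

```python
import base64

# Hypothetical value as it might appear under `data:` in a Secret manifest.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")
print(encoded)

# Anyone who can read the encoded value can reverse it; no key is involved.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)
```

&lt;p&gt;Base64 is an encoding for transporting bytes as text, so protecting the data still depends on controlling who can read the Secret and etcd.&lt;/p&gt;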

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs2o3j402cie9kf7138a3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs2o3j402cie9kf7138a3.png" alt="img4.png" width="800" height="101"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Observability
&lt;/h3&gt;

&lt;p&gt;Most of the Cloud Native scenarios involve some observability-related operations such as logging, monitoring, etc.&lt;/p&gt;

&lt;p&gt;In K8s, logs can be collected in a variety of ways. A common one is to collect and aggregate them directly on each K8s node. If logs are collected this way, the application needs to write its logs to standard output or standard error.&lt;/p&gt;

&lt;p&gt;However, if the business does not make relevant changes and still chooses to write all the application logs to a file in the container, it means that a Sidecar is needed for log collection in each instance, which makes the deployment architecture extremely complex.&lt;/p&gt;

&lt;p&gt;Back at the architecture governance level, selecting a monitoring solution in the Cloud Native environment also poses some challenges. If the wrong solution is selected, the subsequent cost of use is very high, and the loss can be huge.&lt;/p&gt;

&lt;p&gt;There are also capacity issues at the monitoring level. When deploying an application in K8s, you can simply configure resource limits to restrict the resources the application can use. However, in a K8s environment it is still rather easy to overcommit resources, over-utilize them, and run out of memory as a result.&lt;/p&gt;

&lt;p&gt;In addition, when a node or the entire K8s cluster runs out of resources, eviction is triggered, which means workloads already running on a node are evicted to other nodes. If a cluster's resources are tight, a storm of such evictions across nodes can easily cause the entire cluster to crash.&lt;/p&gt;

&lt;h3&gt;
  
  
  Application Evolution and Multi-cluster Pattern
&lt;/h3&gt;

&lt;p&gt;At the application architecture evolution level, the core issue is service discovery.&lt;/p&gt;

&lt;p&gt;K8s provides a DNS-based service discovery mechanism by default, but if cloud workloads coexist with existing (legacy) workloads, handling both with DNS-based service discovery becomes more complicated.&lt;/p&gt;

&lt;p&gt;Meanwhile, as the business of an enterprise that has chosen Cloud Native technology grows, it will gradually start to consider multi-node deployment, which in turn raises multi-cluster issues.&lt;/p&gt;

&lt;p&gt;For example, suppose we want to provide customers with a higher-availability model through multiple clusters. This involves orchestrating services across clusters, distributing load and synchronizing configuration between them, and deciding how to handle and deploy policies for clusters in multi-cloud and hybrid cloud scenarios. These are some of the challenges that will be faced.&lt;/p&gt;

&lt;h2&gt;
  
  
  How APISIX Enables Digital Transformation
&lt;/h2&gt;

&lt;p&gt;Apache APISIX is a Cloud Native API gateway under the Apache Software Foundation. It is dynamic, real-time, and high-performance, and provides rich traffic management features such as load balancing, dynamic upstream, canary release, circuit breaking, authentication, and observability. You can use Apache APISIX to handle traditional north-south traffic, as well as east-west traffic between services.&lt;/p&gt;

&lt;p&gt;In response to the architectural evolution and application changes described above, the Apache APISIX project has also produced an APISIX-based Ingress controller and a Service Mesh solution to help enterprises carry out digital transformation.&lt;/p&gt;

&lt;h3&gt;
  
  
  APISIX Ingress Solution
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/apache/apisix-ingress-controller" rel="noopener noreferrer"&gt;Apache APISIX Ingress Controller&lt;/a&gt; is a Kubernetes Ingress Controller implementation that serves primarily as a traffic gateway for handling north-south Kubernetes traffic.&lt;/p&gt;

&lt;p&gt;Like APISIX itself, APISIX Ingress Controller separates the control plane from the data plane. In this case, APISIX is used as the data plane that does the actual traffic processing.&lt;/p&gt;

&lt;p&gt;Currently, APISIX Ingress Controller supports the following three configuration methods and is compatible with all APISIX plugins out of the box:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Support for Ingress resources native to K8s. This approach gives APISIX Ingress Controller a higher level of adaptability. To date, among influential open-source Ingress controller products, APISIX Ingress Controller supports the most versions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Support for using custom resources. The current custom resources of APISIX Ingress Controller are a set of CRD specifications designed according to APISIX semantics. Using custom resources makes it easy to integrate with APISIX and feels more native.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Support for Gateway API. Gateway API is the next generation of the Ingress standard, and APISIX Ingress Controller has started to support it (Beta stage). As the Gateway API evolves, it is likely to become a built-in K8s resource.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;APISIX Ingress Controller has the following advantages over Ingress NGINX:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Architectural separation&lt;/strong&gt;. In APISIX Ingress, the data plane and the control plane are separated. When traffic processing pressure is high and you want to expand capacity, you can simply scale out the data plane, so that more data plane instances serve external traffic without any adjustment to the control plane.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;High scalability and support for custom plugins&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;APISIX as the data plane, with high performance and fully dynamic features.&lt;/strong&gt; Thanks to the fully dynamic nature of APISIX, using APISIX Ingress protects business traffic as much as possible.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Currently, APISIX Ingress Controller is used by many companies worldwide, such as China Mobile Cloud Open Platform (an open API and cloud IDE product), Upyun, and Copernicus (part of Europe's Eyes on Earth).&lt;/p&gt;

&lt;p&gt;APISIX Ingress Controller is still under continuous iteration, and we plan to improve it in the following ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Complete support for the Gateway API to enable more scenario configurations.&lt;/li&gt;
&lt;li&gt;Support external service proxy.&lt;/li&gt;
&lt;li&gt;Native support for multiple registries to make APISIX Ingress Controller more versatile.&lt;/li&gt;
&lt;li&gt;Architectural updates to create a new architectural model.&lt;/li&gt;
&lt;li&gt;Integrate with Argo CD/Flux and other GitOps tools to create a rich ecosystem.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you are interested in the APISIX Ingress solution, please feel free to follow &lt;a href="https://github.com/apache/apisix-ingress-controller" rel="noopener noreferrer"&gt;the community updates&lt;/a&gt; for product iterations and community trends.&lt;/p&gt;

&lt;h3&gt;
  
  
  APISIX Service Mesh Solution
&lt;/h3&gt;

&lt;p&gt;Currently, in addition to the API gateway and Ingress solution, the APISIX-based Service Mesh solution is also in active iteration.&lt;/p&gt;

&lt;p&gt;The APISIX-based Service Mesh solution consists of two main components, namely the control plane and the data plane. Istio was chosen for the control plane since it is an industry leader with an active community and is supported by multiple vendors. APISIX was chosen to replace Envoy on the data plane, allowing APISIX's high performance and scalability to come into play.&lt;/p&gt;

&lt;p&gt;APISIX's Service Mesh is still being actively pursued, with subsequent iterations planned in the following directions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Performing eBPF acceleration to improve overall effectiveness.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Performing plugin capability integration to allow better use of APISIX Ingress capabilities within the Service Mesh architecture.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Creating a seamless migration tool to provide easier tools and simplify the process for users.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In general, the evolution of architecture and technology in the Cloud Native era brings both opportunities and challenges. As a Cloud Native gateway, Apache APISIX is committed to further technical adaptations and integrations that follow the Cloud Native trend. Solutions based on APISIX have also started to help enterprise users carry out digital transformation and move to Cloud Native more smoothly.&lt;/p&gt;

</description>
      <category>cloudnative</category>
      <category>kubernetes</category>
      <category>servicemesh</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Thoroughly understand Events in Kubernetes</title>
      <dc:creator>Jintao Zhang</dc:creator>
      <pubDate>Tue, 12 Apr 2022 15:42:35 +0000</pubDate>
      <link>https://dev.to/zhangjintao/thoroughly-understand-events-in-kubernetes-29lj</link>
      <guid>https://dev.to/zhangjintao/thoroughly-understand-events-in-kubernetes-29lj</guid>
      <description>&lt;p&gt;Hi everyone, this is Jintao Zhang.&lt;/p&gt;

&lt;p&gt;I previously wrote an article, &lt;a href="https://segmentfault.com/a/1190000040238160/en" rel="noopener noreferrer"&gt;"A More Elegant Kubernetes Cluster Event Measurement Scheme"&lt;/a&gt;, which uses Jaeger tracing to collect events in a Kubernetes cluster and display them. The final effect is as follows:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fren4r8lk795uy13b9a6x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fren4r8lk795uy13b9a6x.png" alt="using Jeager collect events" width="732" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I wrote that article, I promised to explain the underlying principles in detail. I have put it off for a long time. Now it's the end of the year, and it's time to deliver.&lt;/p&gt;
&lt;h2&gt;
  
  
  Events overview
&lt;/h2&gt;

&lt;p&gt;Let's start with a simple example to see what events in a Kubernetes cluster are.&lt;/p&gt;

&lt;p&gt;Create a new namespace called &lt;code&gt;moelove&lt;/code&gt;, and then create a deployment called &lt;code&gt;redis&lt;/code&gt; in it. Next, look at all events in this namespace.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl create ns moelove
namespace/moelove created
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove create deployment redis &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ghcr.io/moelove/redis:alpine 
deployment.apps/redis created
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get deploy
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
redis   1/1     1            1           11s
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get events
LAST SEEN   TYPE     REASON              OBJECT                        MESSAGE
21s         Normal   Scheduled           pod/redis-687967dbc5-27vmr    Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
21s         Normal   Pulling             pod/redis-687967dbc5-27vmr    Pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/redis:alpine"&lt;/span&gt;
15s         Normal   Pulled              pod/redis-687967dbc5-27vmr    Successfully pulled image &lt;span class="s2"&gt;"ghcr.io/moelove/redis:alpine"&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;6.814310968s
14s         Normal   Created             pod/redis-687967dbc5-27vmr    Created container redis
14s         Normal   Started             pod/redis-687967dbc5-27vmr    Started container redis
22s         Normal   SuccessfulCreate    replicaset/redis-687967dbc5   Created pod: redis-687967dbc5-27vmr
22s         Normal   ScalingReplicaSet   deployment/redis              Scaled up replica &lt;span class="nb"&gt;set &lt;/span&gt;redis-687967dbc5 to 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, by default &lt;code&gt;kubectl get events&lt;/code&gt; does not list events in the order in which they occurred, so we often need to add the &lt;code&gt;--sort-by='{.metadata.creationTimestamp}'&lt;/code&gt; parameter so that the output is sorted by time.&lt;/p&gt;

&lt;p&gt;This is why Kubernetes added the &lt;code&gt;kubectl alpha events&lt;/code&gt; command in v1.23. I covered it in detail in the previous article, so I won't expand on it here.&lt;/p&gt;

&lt;p&gt;After sorting by time, you can see the following results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get events &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.metadata.creationTimestamp}'&lt;/span&gt;
LAST SEEN   TYPE     REASON              OBJECT                        MESSAGE
2m12s       Normal   Scheduled           pod/redis-687967dbc5-27vmr    Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
2m13s       Normal   SuccessfulCreate    replicaset/redis-687967dbc5   Created pod: redis-687967dbc5-27vmr
2m13s       Normal   ScalingReplicaSet   deployment/redis              Scaled up replica &lt;span class="nb"&gt;set &lt;/span&gt;redis-687967dbc5 to 1
2m12s       Normal   Pulling             pod/redis-687967dbc5-27vmr    Pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/redis:alpine"&lt;/span&gt;
2m6s        Normal   Pulled              pod/redis-687967dbc5-27vmr    Successfully pulled image &lt;span class="s2"&gt;"ghcr.io/moelove/redis:alpine"&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;6.814310968s
2m5s        Normal   Created             pod/redis-687967dbc5-27vmr    Created container redis
2m5s        Normal   Started             pod/redis-687967dbc5-27vmr    Started container redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the operations above, we can see that an Event is actually a resource in the Kubernetes cluster. When the status of a resource in the cluster changes, new Events can be generated.&lt;/p&gt;

&lt;h2&gt;
  
  
  In-depth Events
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Single Event object
&lt;/h3&gt;

&lt;p&gt;Since an Event is a resource in a Kubernetes cluster, its &lt;code&gt;metadata.name&lt;/code&gt; should normally contain its name, which lets us operate on it individually. So we can use the following command to output the names:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get events &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.metadata.creationTimestamp}'&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{range .items[*]}{.metadata.name}{"\n"}{end}'&lt;/span&gt;
redis-687967dbc5-27vmr.16c4fb7bde8c69d2
redis-687967dbc5.16c4fb7bde6b54c4
redis.16c4fb7bde1bf769
redis-687967dbc5-27vmr.16c4fb7bf8a0ab35
redis-687967dbc5-27vmr.16c4fb7d8ecaeff8
redis-687967dbc5-27vmr.16c4fb7d99709da9
redis-687967dbc5-27vmr.16c4fb7d9be30c06
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Select any one of the event records and output it in YAML format for viewing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get events redis-687967dbc5-27vmr.16c4fb7bde8c69d2 &lt;span class="nt"&gt;-o&lt;/span&gt; yaml
action: Binding
apiVersion: v1
eventTime: &lt;span class="s2"&gt;"2021-12-28T19:31:13.702987Z"&lt;/span&gt;
firstTimestamp: null
involvedObject:
  apiVersion: v1
  kind: Pod
  name: redis-687967dbc5-27vmr
  namespace: moelove
  resourceVersion: &lt;span class="s2"&gt;"330230"&lt;/span&gt;
  uid: 71b97182-5593-47b2-88cc-b3f59618c7aa
kind: Event
lastTimestamp: null
message: Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
metadata:
  creationTimestamp: &lt;span class="s2"&gt;"2021-12-28T19:31:13Z"&lt;/span&gt;
  name: redis-687967dbc5-27vmr.16c4fb7bde8c69d2
  namespace: moelove
  resourceVersion: &lt;span class="s2"&gt;"330235"&lt;/span&gt;
  uid: e5c03126-33b9-4559-9585-5e82adcd96b0
reason: Scheduled
reportingComponent: default-scheduler
reportingInstance: default-scheduler-kind-control-plane
&lt;span class="nb"&gt;source&lt;/span&gt;: &lt;span class="o"&gt;{}&lt;/span&gt;
&lt;span class="nb"&gt;type&lt;/span&gt;: Normal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that it contains a lot of information, which we will not expand on here. Let's look at another example.&lt;/p&gt;

&lt;h3&gt;
  
  
  Events in &lt;code&gt;kubectl describe&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Run &lt;code&gt;kubectl describe&lt;/code&gt; on the Deployment object and the Pod object respectively, and you get the following results (the intermediate output is omitted):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Describe the Deployment
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove describe deploy/redis                
Name:                   redis
Namespace:              moelove
...
Events:
  Type    Reason             Age   From                   Message
  &lt;span class="nt"&gt;----&lt;/span&gt;    &lt;span class="nt"&gt;------&lt;/span&gt;             &lt;span class="nt"&gt;----&lt;/span&gt;  &lt;span class="nt"&gt;----&lt;/span&gt;                   &lt;span class="nt"&gt;-------&lt;/span&gt;
  Normal  ScalingReplicaSet  15m   deployment-controller  Scaled up replica &lt;span class="nb"&gt;set &lt;/span&gt;redis-687967dbc5 to 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Describe the Pod
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove describe pods redis-687967dbc5-27vmr
Name:         redis-687967dbc5-27vmr                                                                 
Namespace:    moelove
Priority:     0
Events:
  Type    Reason     Age   From               Message
  &lt;span class="nt"&gt;----&lt;/span&gt;    &lt;span class="nt"&gt;------&lt;/span&gt;     &lt;span class="nt"&gt;----&lt;/span&gt;  &lt;span class="nt"&gt;----&lt;/span&gt;               &lt;span class="nt"&gt;-------&lt;/span&gt;
  Normal  Scheduled  18m   default-scheduler  Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
  Normal  Pulling    18m   kubelet            Pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/redis:alpine"&lt;/span&gt;
  Normal  Pulled     17m   kubelet            Successfully pulled image &lt;span class="s2"&gt;"ghcr.io/moelove/redis:alpine"&lt;/span&gt; &lt;span class="k"&gt;in &lt;/span&gt;6.814310968s
  Normal  Created    17m   kubelet            Created container redis
  Normal  Started    17m   kubelet            Started container redis

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that when we describe different resource objects, the events shown are only those directly related to that object. When you describe the Deployment, you cannot see Pod-related Events.&lt;/p&gt;

&lt;p&gt;This shows that an Event object contains information about the resource object it describes; the two are directly linked.&lt;/p&gt;

&lt;p&gt;Looking back at the single Event object we saw earlier, the &lt;code&gt;involvedObject&lt;/code&gt; field holds the resource object associated with the Event.&lt;/p&gt;
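
&lt;p&gt;The link between an Event and its resource object can be sketched in Python. This is only an illustration of the idea, not how kubectl is implemented: the dictionaries are simplified stand-ins for Event objects, and &lt;code&gt;events_for&lt;/code&gt; is a hypothetical helper that matches on kind and name only.&lt;/p&gt;

```python
# Simplified stand-ins for Event objects; only the fields needed here are kept.
events = [
    {"reason": "ScalingReplicaSet", "involvedObject": {"kind": "Deployment", "name": "redis"}},
    {"reason": "SuccessfulCreate", "involvedObject": {"kind": "ReplicaSet", "name": "redis-687967dbc5"}},
    {"reason": "Scheduled", "involvedObject": {"kind": "Pod", "name": "redis-687967dbc5-27vmr"}},
    {"reason": "Pulling", "involvedObject": {"kind": "Pod", "name": "redis-687967dbc5-27vmr"}},
]


def events_for(events, kind, name):
    """Return the events whose involvedObject matches the given kind and name."""
    return [
        e for e in events
        if e["involvedObject"]["kind"] == kind and e["involvedObject"]["name"] == name
    ]


deployment_reasons = [e["reason"] for e in events_for(events, "Deployment", "redis")]
pod_reasons = [e["reason"] for e in events_for(events, "Pod", "redis-687967dbc5-27vmr")]

print(deployment_reasons)  # ['ScalingReplicaSet']
print(pod_reasons)  # ['Scheduled', 'Pulling']
```

&lt;p&gt;Filtering on the Deployment returns no Pod events, which mirrors what the two describe outputs show.&lt;/p&gt;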

&lt;h2&gt;
  
  
  Learn more about Events
&lt;/h2&gt;

&lt;p&gt;Let's take a look at the following example, which creates a Deployment that uses a non-existent image:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove create deployment non-exist &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ghcr.io/moelove/non-exist
deployment.apps/non-exist created
&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get pods
NAME                        READY   STATUS         RESTARTS   AGE
non-exist-d9ddbdd84-tnrhd   0/1     ErrImagePull   0          11s
redis-687967dbc5-27vmr      1/1     Running        0          26m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that the new Pod is in the &lt;code&gt;ErrImagePull&lt;/code&gt; state. View the events in the current namespace (I omitted the earlier records for deploy/redis):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get events &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.metadata.creationTimestamp}'&lt;/span&gt;                                                           
LAST SEEN   TYPE      REASON              OBJECT                           MESSAGE
35s         Normal    SuccessfulCreate    replicaset/non-exist-d9ddbdd84   Created pod: non-exist-d9ddbdd84-tnrhd
35s         Normal    ScalingReplicaSet   deployment/non-exist             Scaled up replica &lt;span class="nb"&gt;set &lt;/span&gt;non-exist-d9ddbdd84 to 1
35s         Normal    Scheduled           pod/non-exist-d9ddbdd84-tnrhd    Successfully assigned moelove/non-exist-d9ddbdd84-tnrhd to kind-worker3
17s         Warning   Failed              pod/non-exist-d9ddbdd84-tnrhd    Error: ErrImagePull
17s         Warning   Failed              pod/non-exist-d9ddbdd84-tnrhd    Failed to pull image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;: rpc error: code &lt;span class="o"&gt;=&lt;/span&gt; Unknown desc &lt;span class="o"&gt;=&lt;/span&gt; failed to pull and unpack image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist:latest"&lt;/span&gt;: failed to resolve reference &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist:latest"&lt;/span&gt;: failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
18s         Normal    Pulling             pod/non-exist-d9ddbdd84-tnrhd    Pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;
4s          Warning   Failed              pod/non-exist-d9ddbdd84-tnrhd    Error: ImagePullBackOff
4s          Normal    BackOff             pod/non-exist-d9ddbdd84-tnrhd    Back-off pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;kubectl describe&lt;/code&gt; on this Pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove describe pods non-exist-d9ddbdd84-tnrhd
...
Events:
  Type     Reason     Age                    From               Message
  &lt;span class="nt"&gt;----&lt;/span&gt;     &lt;span class="nt"&gt;------&lt;/span&gt;     &lt;span class="nt"&gt;----&lt;/span&gt;                   &lt;span class="nt"&gt;----&lt;/span&gt;               &lt;span class="nt"&gt;-------&lt;/span&gt;
  Normal   Scheduled  4m                     default-scheduler  Successfully assigned moelove/non-exist-d9ddbdd84-tnrhd to kind-worker3
  Normal   Pulling    2m22s &lt;span class="o"&gt;(&lt;/span&gt;x4 over 3m59s&lt;span class="o"&gt;)&lt;/span&gt;  kubelet            Pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;
  Warning  Failed     2m21s &lt;span class="o"&gt;(&lt;/span&gt;x4 over 3m59s&lt;span class="o"&gt;)&lt;/span&gt;  kubelet            Failed to pull image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;: rpc error: code &lt;span class="o"&gt;=&lt;/span&gt; Unknown desc &lt;span class="o"&gt;=&lt;/span&gt; failed to pull and unpack image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist:latest"&lt;/span&gt;: failed to resolve reference &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist:latest"&lt;/span&gt;: failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
  Warning  Failed     2m21s &lt;span class="o"&gt;(&lt;/span&gt;x4 over 3m59s&lt;span class="o"&gt;)&lt;/span&gt;  kubelet            Error: ErrImagePull
  Warning  Failed     2m9s &lt;span class="o"&gt;(&lt;/span&gt;x6 over 3m58s&lt;span class="o"&gt;)&lt;/span&gt;   kubelet            Error: ImagePullBackOff
  Normal   BackOff    115s &lt;span class="o"&gt;(&lt;/span&gt;x7 over 3m58s&lt;span class="o"&gt;)&lt;/span&gt;   kubelet            Back-off pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can see that the output here differs from that of the Pod that ran correctly. The main difference is in the &lt;code&gt;Age&lt;/code&gt; column, where we see output such as &lt;code&gt;115s (x7 over 3m58s)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This means that this type of event has occurred 7 times over 3m58s, and the most recent one occurred 115s ago.&lt;/p&gt;

&lt;p&gt;But when we ran &lt;code&gt;kubectl get events&lt;/code&gt; directly, we did not see 7 repeated events. This shows that Kubernetes automatically merges duplicate events into one.&lt;/p&gt;
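
&lt;p&gt;As a small aid for reading the &lt;code&gt;Age&lt;/code&gt; column, the following Python sketch splits a cell into its three parts. The helper &lt;code&gt;parse_age&lt;/code&gt; and its regular expression are assumptions written for this article's output format; they are not part of kubectl.&lt;/p&gt;

```python
import re

# Matches an Age cell such as "115s (x7 over 3m58s)" or a plain "18m".
AGE_PATTERN = re.compile(r"^(\S+)(?: \(x(\d+) over (\S+)\))?$")


def parse_age(cell):
    """Split an Age cell into (last_seen, count, window)."""
    match = AGE_PATTERN.match(cell.strip())
    if match is None:
        raise ValueError("unrecognized Age cell: " + repr(cell))
    last_seen, count, window = match.groups()
    # A cell without "(xN over D)" describes a single occurrence.
    return last_seen, int(count) if count else 1, window


print(parse_age("115s (x7 over 3m58s)"))  # ('115s', 7, '3m58s')
print(parse_age("18m"))  # ('18m', 1, None)
```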

&lt;p&gt;Select the last Event (the method has been described in the previous content) and output its content in YAML format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;MoeLove&lt;span class="o"&gt;)&lt;/span&gt; ➜ kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; moelove get events non-exist-d9ddbdd84-tnrhd.16c4fce570cfba46 &lt;span class="nt"&gt;-o&lt;/span&gt; yaml
apiVersion: v1
count: 43
eventTime: null
firstTimestamp: &lt;span class="s2"&gt;"2021-12-28T19:57:06Z"&lt;/span&gt;
involvedObject:
  apiVersion: v1
  fieldPath: spec.containers&lt;span class="o"&gt;{&lt;/span&gt;non-exist&lt;span class="o"&gt;}&lt;/span&gt;
  kind: Pod
  name: non-exist-d9ddbdd84-tnrhd
  namespace: moelove
  resourceVersion: &lt;span class="s2"&gt;"333366"&lt;/span&gt;
  uid: 33045163-146e-4282-b559-fec19a189a10
kind: Event
lastTimestamp: &lt;span class="s2"&gt;"2021-12-28T18:07:14Z"&lt;/span&gt;
message: Back-off pulling image &lt;span class="s2"&gt;"ghcr.io/moelove/non-exist"&lt;/span&gt;
metadata:
  creationTimestamp: &lt;span class="s2"&gt;"2021-12-28T19:57:06Z"&lt;/span&gt;
  name: non-exist-d9ddbdd84-tnrhd.16c4fce570cfba46
  namespace: moelove
  resourceVersion: &lt;span class="s2"&gt;"334638"&lt;/span&gt;
  uid: 60708be0-23b9-481b-a290-dd208fed6d47
reason: BackOff
reportingComponent: &lt;span class="s2"&gt;""&lt;/span&gt;
reportingInstance: &lt;span class="s2"&gt;""&lt;/span&gt;
&lt;span class="nb"&gt;source&lt;/span&gt;:
  component: kubelet
  host: kind-worker3
&lt;span class="nb"&gt;type&lt;/span&gt;: Normal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we can see a &lt;code&gt;count&lt;/code&gt; field, which indicates how many times events of the same type have occurred. &lt;code&gt;firstTimestamp&lt;/code&gt; and &lt;code&gt;lastTimestamp&lt;/code&gt; represent the times at which this event first occurred and most recently occurred, respectively. This also explains the duration shown for the events in the previous output.&lt;/p&gt;
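
&lt;p&gt;To show how these three fields could produce an &lt;code&gt;Age&lt;/code&gt; cell, here is a rough Python sketch. It assumes both durations are measured back from the current time. The &lt;code&gt;count&lt;/code&gt;, &lt;code&gt;lastTimestamp&lt;/code&gt; and &lt;code&gt;now&lt;/code&gt; values are hypothetical, chosen so that the result matches the cell discussed earlier, and &lt;code&gt;short_duration&lt;/code&gt; is a simplification that only handles minutes and seconds.&lt;/p&gt;

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

# Hypothetical values in the same shape as the Event fields discussed above.
event = {
    "count": 7,
    "firstTimestamp": "2021-12-28T19:57:06Z",
    "lastTimestamp": "2021-12-28T19:59:09Z",
}


def short_duration(seconds):
    """Format a duration roughly like the Age column (minutes and seconds only)."""
    if seconds // 120 == 0:  # under two minutes: plain seconds
        return f"{seconds}s"
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m{secs}s"


def age_cell(event, now):
    """Build an Age cell from count, firstTimestamp and lastTimestamp."""
    first = datetime.strptime(event["firstTimestamp"], TIME_FORMAT)
    last = datetime.strptime(event["lastTimestamp"], TIME_FORMAT)
    since_last = int((now - last).total_seconds())
    since_first = int((now - first).total_seconds())
    if event["count"] == 1:
        return short_duration(since_last)
    return f"{short_duration(since_last)} (x{event['count']} over {short_duration(since_first)})"


# Hypothetical "current time" for the example.
now = datetime.strptime("2021-12-28T20:01:04Z", TIME_FORMAT)
print(age_cell(event, now))  # 115s (x7 over 3m58s)
```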

&lt;h2&gt;
  
  
  Understand Events thoroughly
&lt;/h2&gt;

&lt;p&gt;The following is an Event picked at random; we can see some of the fields it contains:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="na"&gt;eventTime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
&lt;span class="na"&gt;firstTimestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2021-12-28T19:31:13Z"&lt;/span&gt;
&lt;span class="na"&gt;involvedObject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ReplicaSet&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-687967dbc5&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;moelove&lt;/span&gt;
  &lt;span class="na"&gt;resourceVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;330227"&lt;/span&gt;
  &lt;span class="na"&gt;uid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;11e98a9d-9062-4ccb-92cb-f51cc74d4c1d&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Event&lt;/span&gt;
&lt;span class="na"&gt;lastTimestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2021-12-28T19:31:13Z"&lt;/span&gt;
&lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Created&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pod:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;redis-687967dbc5-27vmr'&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;creationTimestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2021-12-28T19:31:13Z"&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis-687967dbc5.16c4fb7bde6b54c4&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;moelove&lt;/span&gt;
  &lt;span class="na"&gt;resourceVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;330231"&lt;/span&gt;
  &lt;span class="na"&gt;uid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;8e37ec1e-b3a1-420c-96d4-3b3b2995c300&lt;/span&gt;
&lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;SuccessfulCreate&lt;/span&gt;
&lt;span class="na"&gt;reportingComponent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
&lt;span class="na"&gt;reportingInstance&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;replicaset-controller&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Normal&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The meanings of the main fields are as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;count: how many times this kind of event has occurred (described earlier)&lt;/li&gt;
&lt;li&gt;involvedObject: the resource object directly related to this event (introduced above). Its structure is as follows:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;ObjectReference&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Kind&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Namespace&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;UID&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UID&lt;/span&gt;
    &lt;span class="n"&gt;APIVersion&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;ResourceVersion&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;FieldPath&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;source: the component that reported this event. Its structure is as follows:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;EventSource&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Component&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Host&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;reason: a short summary (or a fixed code) that is well suited to filtering and is mainly meant to be machine-readable. There are currently more than 50 such codes.&lt;/li&gt;
&lt;li&gt;message: a detailed description that is easier for people to understand&lt;/li&gt;
&lt;li&gt;type: currently there are only Normal and Warning, and their meanings are also written in the source code:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// staging/src/k8s.io/api/core/v1/types.go&lt;/span&gt;
&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c"&gt;// Information only and will not cause any problems&lt;/span&gt;
    &lt;span class="n"&gt;EventTypeNormal&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Normal"&lt;/span&gt;
    &lt;span class="c"&gt;// These events are to warn that something might go wrong&lt;/span&gt;
    &lt;span class="n"&gt;EventTypeWarning&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Warning"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
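
&lt;p&gt;Because type and reason are fixed strings, they are convenient filter keys. The following is a minimal sketch of that idea, not code from Kubernetes itself: the &lt;code&gt;Event&lt;/code&gt; struct is a simplified, hypothetical stand-in that keeps only the fields discussed above, &lt;code&gt;filterByType&lt;/code&gt; is a helper name I made up, and the Warning entry in the sample data is invented for illustration.&lt;/p&gt;

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for the core/v1 Event type.
// It keeps only the fields discussed above.
type Event struct {
	Type    string
	Reason  string
	Message string
}

// Same values as the constants in staging/src/k8s.io/api/core/v1/types.go.
const (
	EventTypeNormal  string = "Normal"
	EventTypeWarning string = "Warning"
)

// filterByType returns the events whose Type equals wanted.
func filterByType(events []Event, wanted string) []Event {
	var out []Event
	for _, e := range events {
		if e.Type == wanted {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	events := []Event{
		{Type: EventTypeNormal, Reason: "SuccessfulCreate", Message: "Created pod: redis-687967dbc5-27vmr"},
		// Made-up entry, only here so the filter has something to match.
		{Type: EventTypeWarning, Reason: "Failed", Message: "illustrative warning message"},
	}
	for _, e := range filterByType(events, EventTypeWarning) {
		fmt.Printf("%s: %s\n", e.Reason, e.Message)
	}
}
```

&lt;p&gt;The same loop could compare &lt;code&gt;Reason&lt;/code&gt; instead of &lt;code&gt;Type&lt;/code&gt; when a narrower filter is needed.&lt;/p&gt;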



&lt;p&gt;Therefore, when we collect these Events as a tracing source, we can group them by involvedObject and sort them by time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this article, I used two examples, a correctly deployed Deployment and a Deployment that references a non-existent image, to explain in depth what Events objects actually do and what each field means.&lt;/p&gt;

&lt;p&gt;For Kubernetes, Events contain a lot of useful information, but this information has no effect on how Kubernetes itself behaves, and Events are not actual Kubernetes logs. By default, Events are cleaned up after 1 hour (the default of the kube-apiserver &lt;code&gt;--event-ttl&lt;/code&gt; flag) to reduce the resources they occupy in etcd.&lt;/p&gt;

&lt;p&gt;So, to give cluster administrators a better view of what happened, in production we usually collect the Events of the Kubernetes cluster. The tool I personally recommend is: &lt;a href="https://github.com/opsgenie/kubernetes-event-exporter" rel="noopener noreferrer"&gt;https://github.com/opsgenie/kubernetes-event-exporter&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Of course, you can also follow my previous article &lt;a href="https://segmentfault.com/a/1190000040238160/en" rel="noopener noreferrer"&gt;"A More Elegant Kubernetes Cluster Event Measurement Scheme"&lt;/a&gt;, which uses Jaeger tracing to collect and display the Events in a Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;You are welcome to subscribe to my articles: 【MoeLove】&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>go</category>
      <category>observability</category>
    </item>
  </channel>
</rss>
