<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vincent</title>
    <description>The latest articles on DEV Community by Vincent (@zingboum).</description>
    <link>https://dev.to/zingboum</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F850236%2Fcb818ef3-77a3-4dfa-9241-c785ecd4225f.png</url>
      <title>DEV Community: Vincent</title>
      <link>https://dev.to/zingboum</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zingboum"/>
    <language>en</language>
    <item>
      <title>Surviving N4 stockouts in GKE</title>
      <dc:creator>Vincent</dc:creator>
      <pubDate>Sat, 13 Jun 2026 15:55:44 +0000</pubDate>
      <link>https://dev.to/zingboum/surviving-n4-stockouts-in-gke-7fe</link>
      <guid>https://dev.to/zingboum/surviving-n4-stockouts-in-gke-7fe</guid>
      <description>&lt;h2&gt;
  
  
  The problem I ran into
&lt;/h2&gt;

&lt;p&gt;I was bringing up the startup infrastructure for a solution standardized on the &lt;strong&gt;N4&lt;/strong&gt; machine series. N4 is a great general-purpose default for us, so the design assumed N4 capacity would be there when we asked for it.&lt;/p&gt;

&lt;p&gt;It wasn't. In several of our target regions I simply could not get N4 nodes when I needed them. New nodes wouldn't come up during scale-up events, and worse, I couldn't even create the &lt;em&gt;initial&lt;/em&gt; infrastructure in some zones. The cluster autoscaler kept logging the now-familiar Compute Engine signal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ZONE_RESOURCE_POOL_EXHAUSTED
The zone 'projects/PROJECT_ID/zones/ZONE' does not have enough resources
available to fulfill the request. Try a different zone, or try again later.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not a quota problem and not a config problem. It's a capacity problem: GCP didn't have the N4 hardware free in that zone at that moment. A few things made this clear to me as I dug in.&lt;/p&gt;

&lt;p&gt;Google's own resource-availability troubleshooting page is explicit that these errors happen when you request resources in a zone that can't currently accommodate them, that they're unrelated to your quota, and that they only apply to &lt;em&gt;new&lt;/em&gt; resource requests — existing VMs keep running fine. Their recommended mitigations are to retry in another zone, retry in another region, retry later, or &lt;strong&gt;change the hardware configuration you're asking for&lt;/strong&gt;. (&lt;a href="https://docs.cloud.google.com/compute/docs/troubleshooting/troubleshooting-resource-availability" rel="noopener noreferrer"&gt;Troubleshooting resource availability errors&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;The GKE ComputeClass troubleshooting docs say the same thing from the cluster's point of view: when GKE can't provision nodes for a priority rule — for example because of &lt;code&gt;ZONE_RESOURCE_POOL_EXHAUSTED&lt;/code&gt; or &lt;code&gt;QUOTA_EXCEEDED&lt;/code&gt; from Compute Engine — the autoscaler immediately moves on to the next rule, with no waiting period (the exception being TPUs and Flex Start). (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/troubleshooting/custom-computeclass" rel="noopener noreferrer"&gt;Troubleshoot custom ComputeClass issues&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;And I'm clearly not alone. The lack of capacity for specific machine families in specific zones is a long-running, well-documented reality on GCP. You can find it discussed in places like the GCE discussion group thread on &lt;a href="https://groups.google.com/g/gce-discussion/c/Lfyk38giqK8" rel="noopener noreferrer"&gt;resources not being available to fulfill the request&lt;/a&gt;, the &lt;a href="https://groups.google.com/g/gce-discussion/c/vlKpp3BlYjA" rel="noopener noreferrer"&gt;ZONE_RESOURCE_POOL_EXHAUSTED thread&lt;/a&gt;, and the Google developer-forum report of &lt;a href="https://discuss.google.dev/t/zone-resource-pool-and-quotas-exhausted/151884" rel="noopener noreferrer"&gt;not being able to provision anything on GKE&lt;/a&gt; across multiple European zones. Even Google's &lt;a href="https://cloud.google.com/blog/products/containers-kubernetes/introducing-new-gke-custom-compute-class-api" rel="noopener noreferrer"&gt;custom compute class launch blog&lt;/a&gt; quotes Delivery Hero saying the thing I wanted to be able to say: that with fallback rules they stopped having to worry about instance availability at all.&lt;/p&gt;

&lt;p&gt;So I stopped trying to force N4 and instead made my infrastructure resilient to N4 being temporarily unavailable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: a dedicated cluster with a cluster-wide default ComputeClass
&lt;/h2&gt;

&lt;p&gt;My solution doesn't bolt a ComputeClass onto an existing shared cluster, and it deliberately does &lt;strong&gt;not&lt;/strong&gt; attach the class to individual workloads with per-Pod selectors. Instead, &lt;strong&gt;I provision a specific, dedicated GKE cluster for the whole solution&lt;/strong&gt;, and I make a single custom ComputeClass the &lt;strong&gt;cluster-level default&lt;/strong&gt;. Every Pod that lands on that cluster inherits the N4→C4 behavior automatically, whether or not it asks for it. The workload manifests stay completely clean — no &lt;code&gt;nodeSelector&lt;/code&gt;, no per-namespace labels, nothing for app teams to remember.&lt;/p&gt;

&lt;p&gt;A ComputeClass is a Kubernetes custom resource (&lt;code&gt;apiVersion: cloud.google.com/v1&lt;/code&gt;, &lt;code&gt;kind: ComputeClass&lt;/code&gt;), backed by a CRD that GKE installs for you. It lets me declare an ordered list of node configurations, called "priorities", that GKE walks through when it needs to scale up. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-custom-compute-classes" rel="noopener noreferrer"&gt;About custom ComputeClasses&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Two features make this work:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fallback compute priorities.&lt;/strong&gt; I list N4 first and C4 second. If GKE can't get N4, it doesn't leave my Pods Pending — it falls straight through to C4. C4 is the natural sibling to reach for: it's a current-generation Intel general-purpose series with a comparable shape to N4, so my workloads behave consistently on it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Active migration back to the preferred option.&lt;/strong&gt; With &lt;code&gt;activeMigration.optimizeRulePriority: true&lt;/code&gt;, once N4 capacity reappears in my location (capacity frees up, or my quota goes up), GKE creates a new N4 node, then cordons and drains the C4 node it had been using as a fallback. The fleet self-heals back to N4 without me touching anything. This N4→C4 example is, in fact, the exact scenario Google documents for active migration. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-custom-compute-classes" rel="noopener noreferrer"&gt;About custom ComputeClasses&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The reason the &lt;em&gt;dedicated cluster + cluster-level default&lt;/em&gt; combination matters — and isn't just a stylistic choice — is covered in the next section. The short version: it's the only configuration where my N4 fail-back actually fires for the whole solution.&lt;/p&gt;

&lt;h3&gt;
  
  
  The default ComputeClass I deployed
&lt;/h3&gt;

&lt;p&gt;To make a custom ComputeClass the &lt;strong&gt;cluster-wide default&lt;/strong&gt;, you give it the reserved name &lt;code&gt;default&lt;/code&gt;. GKE then uses its priority rules as the autoscaling rules for any Pod that doesn't select a different class. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-compute-classes" rel="noopener noreferrer"&gt;About GKE ComputeClasses&lt;/a&gt;)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cloud.google.com/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ComputeClass&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# The reserved name "default" makes this the cluster-wide default class.&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;priorities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# 1) Preferred: N4 on-demand&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;machineFamily&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;n4&lt;/span&gt;
      &lt;span class="na"&gt;spot&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="c1"&gt;# 2) Fallback: C4 on-demand when N4 is exhausted&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;machineFamily&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;c4&lt;/span&gt;
      &lt;span class="na"&gt;spot&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

  &lt;span class="c1"&gt;# When N4 capacity returns, replace C4 nodes with N4 nodes over time.&lt;/span&gt;
  &lt;span class="c1"&gt;# This is what makes the C4 fallback temporary by design.&lt;/span&gt;
  &lt;span class="na"&gt;activeMigration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;optimizeRulePriority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="c1"&gt;# Let GKE create the node pools for me.&lt;/span&gt;
  &lt;span class="na"&gt;nodePoolAutoCreation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="c1"&gt;# Be explicit: if NEITHER N4 nor C4 can be provisioned, keep Pods&lt;/span&gt;
  &lt;span class="c1"&gt;# Pending rather than silently scaling up some default machine type.&lt;/span&gt;
  &lt;span class="na"&gt;whenUnsatisfiable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;DoNotScaleUp&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I set &lt;code&gt;whenUnsatisfiable&lt;/code&gt; &lt;strong&gt;explicitly&lt;/strong&gt;. The default changed at GKE 1.33: before that version the default was &lt;code&gt;ScaleUpAnyway&lt;/code&gt;, so an unsatisfiable set of rules meant scaling up with the cluster's default machine type, which for me would have meant surprise E2 nodes instead of a clear Pending signal. From 1.33 on, the default is &lt;code&gt;DoNotScaleUp&lt;/code&gt;. Setting the field explicitly means an upgrade can't silently change how the cluster behaves. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-custom-compute-classes" rel="noopener noreferrer"&gt;About custom ComputeClasses&lt;/a&gt;) I use &lt;code&gt;DoNotScaleUp&lt;/code&gt; because I'd rather alert on Pending Pods; if you'd rather keep running on &lt;em&gt;anything&lt;/em&gt;, use &lt;code&gt;ScaleUpAnyway&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Because this is the cluster default, &lt;strong&gt;workloads need no changes at all&lt;/strong&gt; — there are no &lt;code&gt;nodeSelector&lt;/code&gt; blocks and no namespace labels to apply. One caution that informed the dedicated-cluster decision: GKE recommends you use &lt;em&gt;either&lt;/em&gt; ComputeClasses &lt;em&gt;or&lt;/em&gt; individual node selectors, but not both, since mixing them causes unexpected scheduling. A dedicated cluster where the default class governs everything avoids that conflict entirely. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-compute-classes" rel="noopener noreferrer"&gt;About GKE ComputeClasses&lt;/a&gt;)&lt;/p&gt;
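
&lt;p&gt;To make "no changes" concrete, here is a minimal sketch of what a workload on this cluster can look like. The names (&lt;code&gt;checkout&lt;/code&gt;, the image path) are hypothetical placeholders, not taken from my real manifests. The point is only what is absent: no &lt;code&gt;nodeSelector&lt;/code&gt;, no affinity, no tolerations.&lt;/p&gt;

```yaml
# Hypothetical workload: the name and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      # Deliberately no nodeSelector, affinity, or tolerations:
      # the cluster-level default ComputeClass decides N4 vs. C4.
      containers:
        - name: checkout
          image: registry.example.com/checkout:1.0.0
          resources:
            requests:
              cpu: "500m"
              memory: 512Mi
```

&lt;p&gt;The resource requests still matter: they are what the autoscaler uses to size the nodes it asks for, whichever machine family ends up serving them.&lt;/p&gt;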

&lt;h2&gt;
  
  
  The version trap with default ComputeClasses
&lt;/h2&gt;

&lt;p&gt;This is the part that drove the whole design, and where I lost the most time, so I'm being emphatic about it.&lt;/p&gt;

&lt;p&gt;Custom ComputeClasses themselves are old enough now (supported since GKE &lt;strong&gt;1.30.3-gke.1451000&lt;/strong&gt;) that it's easy to assume "defaults" came with them. They didn't. &lt;strong&gt;Setting a ComputeClass as a default is a much newer capability, and the two kinds of default have different version floors — and, critically, different behavior.&lt;/strong&gt; (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/run-pods-default-compute-classes" rel="noopener noreferrer"&gt;Apply ComputeClasses to Pods by default&lt;/a&gt;)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cluster-level default&lt;/strong&gt; (the class named &lt;code&gt;default&lt;/code&gt;): requires GKE &lt;strong&gt;1.33.1-gke.1744000&lt;/strong&gt; or later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Namespace-level default&lt;/strong&gt; (a &lt;code&gt;cloud.google.com/default-compute-class&lt;/code&gt; label on the namespace), applied to &lt;strong&gt;only non-DaemonSet Pods&lt;/strong&gt;: requires GKE &lt;strong&gt;1.33.1-gke.1788000&lt;/strong&gt; or later.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The behavioral difference is the trap, and it's the reason a dedicated cluster with a &lt;em&gt;cluster-level&lt;/em&gt; default is the right shape for my solution rather than a namespace label on a shared cluster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;With a &lt;strong&gt;cluster-level&lt;/strong&gt; default, &lt;code&gt;activeMigration.optimizeRulePriority&lt;/code&gt; &lt;strong&gt;does&lt;/strong&gt; fire — if lower-priority C4 nodes exist and GKE can now create N4 nodes, it migrates. That's exactly my N4 fail-back.&lt;/li&gt;
&lt;li&gt;With a &lt;strong&gt;namespace-level&lt;/strong&gt; default, active migration &lt;strong&gt;is not triggered at all&lt;/strong&gt;, because GKE only injects the ComputeClass selector into &lt;em&gt;newly created&lt;/em&gt; Pods. Existing Pods are never re-created with the selector, so they never migrate back to N4. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-compute-classes" rel="noopener noreferrer"&gt;About GKE ComputeClasses&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
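
&lt;p&gt;For contrast, this is roughly what the namespace-level route I rejected looks like. It is a sketch under assumptions: the namespace name &lt;code&gt;shop&lt;/code&gt; and the class name &lt;code&gt;n4-with-c4-fallback&lt;/code&gt; are hypothetical, and I'm assuming the label value is the name of a ComputeClass that already exists in the cluster. Check the default-ComputeClass how-to for the exact label variants before relying on it.&lt;/p&gt;

```yaml
# Namespace-level default (the approach I avoided).
apiVersion: v1
kind: Namespace
metadata:
  name: shop  # hypothetical namespace
  labels:
    # Assumed to reference an existing ComputeClass by name.
    cloud.google.com/default-compute-class: n4-with-c4-fallback
```

&lt;p&gt;Only Pods created after the label is in place pick up the class, which is why Pods already running on C4 nodes never get moved back to N4 this way.&lt;/p&gt;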

&lt;p&gt;In other words, if I'd taken the easy path of labeling a namespace on an existing cluster, my Pods would have fallen back to C4 during a stockout and &lt;strong&gt;stayed on C4 forever&lt;/strong&gt;, even after N4 came back. The fail-back — the whole point — would silently not work. The cluster-level default is the only configuration that gives me both the automatic inheritance &lt;em&gt;and&lt;/em&gt; the migration back to N4, and that pushed me to stand up a cluster dedicated to this solution where naming a class &lt;code&gt;default&lt;/code&gt; is safe and intentional.&lt;/p&gt;

&lt;p&gt;Stacking the requirements, the &lt;strong&gt;binding version floor for my solution&lt;/strong&gt; is the combination of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cluster-level default ComputeClass → &lt;strong&gt;1.33.1-gke.1744000&lt;/strong&gt;, and&lt;/li&gt;
&lt;li&gt;ComputeClass-driven node-pool auto-creation &lt;em&gt;without&lt;/em&gt; cluster-level node auto-provisioning → &lt;strong&gt;1.33.3-gke.1136000&lt;/strong&gt; on the &lt;strong&gt;Rapid&lt;/strong&gt; release channel.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I target &lt;strong&gt;≥ 1.33.3-gke.1136000 on Rapid&lt;/strong&gt;, which clears both. Below the cluster-level default floor (anything before 1.33.1-gke.1744000), you simply &lt;em&gt;cannot&lt;/em&gt; set a cluster-wide default: you'd be forced back into per-workload selectors or namespace labels, and the latter breaks the fail-back as described above.&lt;/p&gt;

&lt;h3&gt;
  
  
  Version requirements I treat as mandatory
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability I depend on&lt;/th&gt;
&lt;th&gt;Minimum GKE version&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Custom ComputeClasses at all (the base feature)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.30.3-gke.1451000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Cluster-level default ComputeClass&lt;/strong&gt; (class named &lt;code&gt;default&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.33.1-gke.1744000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Namespace-level default for non-DaemonSet Pods (I deliberately avoid this)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.33.1-gke.1788000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node-pool auto-creation &lt;strong&gt;from the ComputeClass without&lt;/strong&gt; cluster-level NAP&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;1.33.3-gke.1136000&lt;/strong&gt;, Rapid channel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exact custom machine types in &lt;code&gt;priorities&lt;/code&gt; + Kubernetes labels set by the class&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.33.2-gke.1111000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;nodePoolConfig.imageType&lt;/code&gt; in a ComputeClass&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.32.4-gke.1198000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;(Sources: &lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/run-pods-default-compute-classes" rel="noopener noreferrer"&gt;default ComputeClasses&lt;/a&gt;, &lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/node-attributes-compute-classes" rel="noopener noreferrer"&gt;node-attributes ComputeClass how-to&lt;/a&gt;, &lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning" rel="noopener noreferrer"&gt;node pool auto-creation&lt;/a&gt;, &lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/release-notes-new-features" rel="noopener noreferrer"&gt;GKE release notes&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;One more trap on top of the version floors: &lt;strong&gt;&lt;code&gt;nodePoolAutoCreation.enabled: true&lt;/code&gt; is not enough on its own on older clusters.&lt;/strong&gt; Before &lt;strong&gt;1.33.3-gke.1136000&lt;/strong&gt;, you &lt;em&gt;also&lt;/em&gt; have to enable cluster-level node auto-provisioning (NAP), or GKE won't create the node pools for your class and Pods just sit Pending — which looks exactly like the stockout you were trying to solve. At/after that version on Rapid you can let the ComputeClass drive auto-creation directly. (&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/about-custom-compute-classes" rel="noopener noreferrer"&gt;About custom ComputeClasses&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;On the tooling side, the versions I standardized on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Terraform CLI&lt;/strong&gt; ≥ 1.5 (I run 1.15.x).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hashicorp/google&lt;/code&gt; provider 7.x&lt;/strong&gt; — I pin &lt;code&gt;~&amp;gt; 7.36&lt;/code&gt;. (The 7.0 line has been GA since 2025; 7.36 was current when I wrote this.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hashicorp/kubernetes&lt;/code&gt; provider&lt;/strong&gt; ≥ 2.30 to apply the ComputeClass custom resource through Terraform (the CRD itself ships with GKE).&lt;/li&gt;
&lt;li&gt;If you use the community &lt;strong&gt;&lt;code&gt;terraform-google-modules/terraform-google-kubernetes-engine&lt;/code&gt;&lt;/strong&gt; module with its &lt;code&gt;enable_default_compute_class&lt;/code&gt; plumbing, it requires the &lt;strong&gt;google provider ≥ 7.10&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Terraform
&lt;/h2&gt;

&lt;p&gt;I provision the dedicated cluster and its default ComputeClass as code. Three pieces: providers (pinned), the dedicated cluster (with the version floor and autoscaling/NAP settings), and the &lt;code&gt;default&lt;/code&gt; ComputeClass manifest.&lt;/p&gt;

&lt;h3&gt;
  
  
  Providers and version pinning
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.5"&lt;/span&gt;

  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;google&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/google"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 7.36"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;kubernetes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/kubernetes"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 2.30"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"google"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The dedicated cluster — pin the version, turn on autoscaling / NAP
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# A cluster built specifically and exclusively for this solution.&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_container_cluster"&lt;/span&gt; &lt;span class="s2"&gt;"solution"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"n4-solution-euw3"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt;

  &lt;span class="c1"&gt;# Rapid channel keeps us on a version new enough for both the&lt;/span&gt;
  &lt;span class="c1"&gt;# cluster-level default ComputeClass AND ComputeClass-driven&lt;/span&gt;
  &lt;span class="c1"&gt;# node-pool auto-creation without cluster-level NAP.&lt;/span&gt;
  &lt;span class="nx"&gt;release_channel&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;channel&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RAPID"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Insist on a control-plane version that supports the cluster-level&lt;/span&gt;
  &lt;span class="c1"&gt;# default ComputeClass and auto-creation. This is the binding floor.&lt;/span&gt;
  &lt;span class="nx"&gt;min_master_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1.33.3-gke.1136000"&lt;/span&gt;

  &lt;span class="nx"&gt;remove_default_node_pool&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;initial_node_count&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

  &lt;span class="c1"&gt;# Cluster-level node auto-provisioning. REQUIRED if your control plane is&lt;/span&gt;
  &lt;span class="c1"&gt;# older than 1.33.3-gke.1136000; optional (but harmless) at/after it.&lt;/span&gt;
  &lt;span class="nx"&gt;cluster_autoscaling&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

    &lt;span class="nx"&gt;resource_limits&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;resource_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cpu"&lt;/span&gt;
      &lt;span class="nx"&gt;minimum&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
      &lt;span class="nx"&gt;maximum&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;resource_limits&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;resource_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"memory"&lt;/span&gt;
      &lt;span class="nx"&gt;minimum&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
      &lt;span class="nx"&gt;maximum&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2048&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;auto_provisioning_defaults&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;management&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;auto_repair&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
        &lt;span class="nx"&gt;auto_upgrade&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# A small autoscaling system pool so the Standard cluster satisfies the&lt;/span&gt;
&lt;span class="c1"&gt;# "at least one node pool has autoscaling enabled" requirement.&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_container_node_pool"&lt;/span&gt; &lt;span class="s2"&gt;"system"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"system"&lt;/span&gt;
  &lt;span class="nx"&gt;cluster&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;google_container_cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt;

  &lt;span class="nx"&gt;autoscaling&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;min_node_count&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="nx"&gt;max_node_count&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;node_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;machine_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"e2-standard-4"&lt;/span&gt;
    &lt;span class="nx"&gt;oauth_scopes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"https://www.googleapis.com/auth/cloud-platform"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The cluster-level default ComputeClass as a Terraform-managed manifest
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"google_client_config"&lt;/span&gt; &lt;span class="s2"&gt;"default"&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;host&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://${google_container_cluster.solution.endpoint}"&lt;/span&gt;
  &lt;span class="nx"&gt;cluster_ca_certificate&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;base64decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nx"&gt;google_container_cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;master_auth&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;cluster_ca_certificate&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;google_client_config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;access_token&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_manifest"&lt;/span&gt; &lt;span class="s2"&gt;"default_compute_class"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;manifest&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;apiVersion&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cloud.google.com/v1"&lt;/span&gt;
    &lt;span class="nx"&gt;kind&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ComputeClass"&lt;/span&gt;
    &lt;span class="nx"&gt;metadata&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;# "default" =&amp;gt; cluster-wide default. Requires GKE 1.33.1-gke.1744000+.&lt;/span&gt;
      &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"default"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;spec&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;priorities&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;machineFamily&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"n4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;spot&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;machineFamily&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"c4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;spot&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="c1"&gt;# Fires for a CLUSTER-level default; would NOT fire for a namespace default.&lt;/span&gt;
      &lt;span class="nx"&gt;activeMigration&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;optimizeRulePriority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;nodePoolAutoCreation&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;whenUnsatisfiable&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"DoNotScaleUp"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;google_container_cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Gotcha worth knowing:&lt;/strong&gt; &lt;code&gt;kubernetes_manifest&lt;/code&gt; validates against the cluster's API at &lt;em&gt;plan&lt;/em&gt; time, so the &lt;code&gt;ComputeClass&lt;/code&gt; CRD must already exist on the control plane when you plan. On a brand-new cluster that means a two-phase apply (cluster first, then the manifest), or driving the &lt;code&gt;kubectl apply&lt;/code&gt; from a &lt;code&gt;null_resource&lt;/code&gt;/&lt;code&gt;local-exec&lt;/code&gt; against the rendered YAML. On GKE the CRD ships with the control plane, so once the cluster exists, the manifest applies cleanly.&lt;/p&gt;
&lt;/blockquote&gt;
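
&lt;p&gt;A minimal sketch of the &lt;code&gt;null_resource&lt;/code&gt;/&lt;code&gt;local-exec&lt;/code&gt; route, under a few assumptions that are not part of the setup above: the rendered manifest lives in a hypothetical &lt;code&gt;computeclass.yaml&lt;/code&gt; next to the module, &lt;code&gt;kubectl&lt;/code&gt; is installed where Terraform runs, and its current context already points at the new cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Hypothetical alternative to kubernetes_manifest: nothing is checked at plan time.&lt;/span&gt;
&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"null_resource"&lt;/span&gt; &lt;span class="s2"&gt;"default_compute_class"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;# Re-run kubectl whenever the rendered manifest file changes.&lt;/span&gt;
  &lt;span class="nx"&gt;triggers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;manifest_sha&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;filesha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"${path.module}/computeclass.yaml"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Assumes the current kubectl context targets the cluster created above.&lt;/span&gt;
  &lt;span class="k"&gt;provisioner&lt;/span&gt; &lt;span class="s2"&gt;"local-exec"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"kubectl apply -f ${path.module}/computeclass.yaml"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;google_container_cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The trade-off is that Terraform then only notices changes to the file, not drift on the &lt;code&gt;ComputeClass&lt;/code&gt; object in the cluster.&lt;/p&gt;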

&lt;p&gt;Because this is the cluster default, there are &lt;strong&gt;no workload-side Terraform or manifest changes&lt;/strong&gt;. Anything scheduled on this dedicated cluster inherits N4→C4→N4 automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell my past self
&lt;/h2&gt;

&lt;p&gt;The N4 stockouts were never going to be "fixed" by retrying harder — capacity in a given zone isn't something I control. What I &lt;em&gt;could&lt;/em&gt; control was whether a missing N4 node meant a stalled rollout or a transparent fall-through to C4. Building a &lt;strong&gt;dedicated cluster&lt;/strong&gt; with a &lt;strong&gt;cluster-level default&lt;/strong&gt; ComputeClass gave me a declarative, GitOps-friendly way to express "prefer N4, accept C4, return to N4" for the entire solution at once — no per-workload wiring, and no risk of mixing class selection with node selectors.&lt;/p&gt;

&lt;p&gt;Three things genuinely mattered in practice: &lt;strong&gt;set &lt;code&gt;whenUnsatisfiable&lt;/code&gt; explicitly&lt;/strong&gt; so an upgrade never changes my failure mode behind my back; &lt;strong&gt;use a cluster-level default, not a namespace label&lt;/strong&gt;, because only the cluster-level default actually triggers the active migration back to N4; and &lt;strong&gt;pin the GKE version&lt;/strong&gt; to at least 1.33.3-gke.1136000 on Rapid, because the default-ComputeClass and auto-creation features I'm relying on simply don't exist on older clusters. Get those right and the stockouts stop being incidents and become a non-event.&lt;/p&gt;

</description>
      <category>gcp</category>
      <category>resources</category>
      <category>quota</category>
      <category>productivity</category>
    </item>
    <item>
      <title>IoT certificates for AWS with Terraform</title>
      <dc:creator>Vincent</dc:creator>
      <pubDate>Sat, 11 Nov 2023 19:02:15 +0000</pubDate>
      <link>https://dev.to/zingboum/iot-certificates-for-aws-with-terraform-c6</link>
      <guid>https://dev.to/zingboum/iot-certificates-for-aws-with-terraform-c6</guid>
      <description>&lt;p&gt;Today I am trying to simplify the deployment of small projects with only a few things on AWS. Creating the certificates in the AWS console works and is an easy way to get started, but doing it with IaC is a breeze by comparison.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create certificate requests with Terraform
&lt;/h2&gt;

&lt;p&gt;Creating a private key and a certificate signing request (CSR) with &lt;code&gt;terraform&lt;/code&gt; is pretty simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"tls_private_key"&lt;/span&gt; &lt;span class="s2"&gt;"signed_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;algorithm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RSA"&lt;/span&gt;
  &lt;span class="nx"&gt;rsa_bits&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2048&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"tls_cert_request"&lt;/span&gt; &lt;span class="s2"&gt;"signed_1"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;private_key_pem&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tls_private_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signed_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_key_pem&lt;/span&gt;

  &lt;span class="nx"&gt;subject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;organization&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"iot.rpi.com"&lt;/span&gt;
    &lt;span class="nx"&gt;common_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"thing1.iot.rpi.com"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Get a certificate from AWS IoT
&lt;/h2&gt;

&lt;p&gt;Using the &lt;code&gt;aws_iot_certificate&lt;/code&gt; resource, we can get a certificate from the Certificate Signing Request (CSR) generated just before.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iot_certificate"&lt;/span&gt; &lt;span class="s2"&gt;"iot_certificate"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;active&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;csr&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tls_cert_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signed_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cert_request_pem&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Output the certificates to use them
&lt;/h2&gt;

&lt;p&gt;Then, for our IoT thing, we will need the private key used to create our CSR, the certificate itself, and the Amazon root CA, which the device uses to verify the AWS IoT endpoint.&lt;/p&gt;

&lt;p&gt;We can output them and mark them as sensitive. Terraform still stores them in plain text in the &lt;code&gt;terraform.tfstate&lt;/code&gt; file, so protect that file accordingly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Get a CA used to sign the certificate&lt;/span&gt;
&lt;span class="k"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"tls_certificate"&lt;/span&gt; &lt;span class="s2"&gt;"aws_root_ca"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://www.amazontrust.com/repository/AmazonRootCA1.pem"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"thing_cert_pem"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iot_certificate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;iot_certificate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;certificate_pem&lt;/span&gt;
  &lt;span class="nx"&gt;sensitive&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"thing_key_pem"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tls_private_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signed_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_key_pem&lt;/span&gt;
  &lt;span class="nx"&gt;sensitive&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"aws_root_ca_pem"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;sensitive&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tls_certificate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_root_ca&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result resides in the state file, which we can parse to get the values:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/sh&lt;/span&gt;

&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; certs

&lt;span class="nv"&gt;TERRAFORM_STATE_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;terraform.tfstate
&lt;span class="nv"&gt;CERT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;jq &lt;span class="s2"&gt;".outputs.thing_cert_pem.value"&lt;/span&gt; &amp;lt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TERRAFORM_STATE_FILE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt; &lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;jq &lt;span class="s2"&gt;".outputs.thing_key_pem.value"&lt;/span&gt; &amp;lt; &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TERRAFORM_STATE_FILE&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;| &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# shellcheck disable=SC2039&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CERT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; certs/thing1.crt.pem
&lt;span class="c"&gt;# shellcheck disable=SC2039&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; certs/thing1.key.pem
wget &lt;span class="nt"&gt;-O&lt;/span&gt; certs/AmazonRootCA1.pem &lt;span class="s2"&gt;"https://www.amazontrust.com/repository/AmazonRootCA1.pem"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The PEM files will then be available in the &lt;code&gt;certs&lt;/code&gt; folder, ready to use!&lt;/p&gt;
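
&lt;p&gt;If parsing the state file by hand feels fragile, one possible alternative is to let Terraform write the files itself. This is only a sketch, assuming the &lt;code&gt;hashicorp/local&lt;/code&gt; provider is available; the resource names and file paths are illustrative.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight terraform"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Assumption: hashicorp/local provider; names and paths below are illustrative.&lt;/span&gt;
&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"local_sensitive_file"&lt;/span&gt; &lt;span class="s2"&gt;"thing_cert"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iot_certificate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;iot_certificate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;certificate_pem&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${path.module}/certs/thing1.crt.pem"&lt;/span&gt;
  &lt;span class="nx"&gt;file_permission&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0600"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# The private key gets the same owner-only permissions.&lt;/span&gt;
&lt;span class="k"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"local_sensitive_file"&lt;/span&gt; &lt;span class="s2"&gt;"thing_key"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;content&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tls_private_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signed_1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;private_key_pem&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${path.module}/certs/thing1.key.pem"&lt;/span&gt;
  &lt;span class="nx"&gt;file_permission&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0600"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The private key still lives in the state file with this approach, so that file needs the same protection either way.&lt;/p&gt;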




&lt;p&gt;If you want to help me, buy me a coffee: &lt;a href="https://www.buymeacoffee.com/u6gr0cy"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--6Oibfu3K--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://cdn.buymeacoffee.com/buttons/default-orange.png" alt="Buy Me A Coffee" width="434" height="100"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thank you for reading&lt;/p&gt;

</description>
      <category>iot</category>
      <category>awsiot</category>
      <category>terraform</category>
    </item>
  </channel>
</rss>
