DEV Community

Giancarlo Rocha
Fix Create Failed Status On AWS EKS Node Group

This is a possible fix for the "Create failed" status on EKS node groups.

My setup uses:

  • A custom AMI with NVIDIA drivers
  • A custom user data script to join the EKS cluster

The AWS documentation on launch templates with a custom AMI says:

The --apiserver-endpoint, --b64-cluster-ca, and --dns-cluster-ip arguments are optional.

The documentation states that those parameters are optional, and they really are, but lately my nodes couldn't join my cluster. I tried a lot of things to fix this, and what ended up working was passing all three arguments explicitly in the user data script.

Here are the Terraform snippets.

In the cluster module, I export the values of the parameters I need:

# Cluster definition as a module
resource "aws_eks_cluster" "cluster" {
  name     = var.name
  ...
}

# Derive the cluster DNS IP: host number 10 inside the service CIDR
locals {
  service_cidr   = aws_eks_cluster.cluster.kubernetes_network_config[0].service_ipv4_cidr
  dns_cluster_ip = cidrhost(local.service_cidr, 10)
}

output "certificate_authority_data" {
  value = aws_eks_cluster.cluster.certificate_authority[0].data
}

output "cluster_endpoint" {
  value = aws_eks_cluster.cluster.endpoint
}

output "dns_cluster_ip" {
  value = local.dns_cluster_ip
}
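
To make the cidrhost(local.service_cidr, 10) step concrete, here is a minimal bash sketch of the same arithmetic for IPv4. The cidr_host function is a hypothetical helper written for illustration, not part of Terraform or the EKS AMI, and it skips the validation the real cidrhost performs (range checks, negative host numbers, IPv6). The sample CIDR 10.100.0.0/16 is only an assumed value.

```shell
#!/bin/bash
# Hypothetical helper: mimic Terraform's cidrhost(cidr, hostnum) for IPv4 only.
# No validation is done; this is a sketch of the arithmetic, nothing more.
cidr_host() {
  local cidr="$1" hostnum="$2"
  local ip="${cidr%/*}" prefix="${cidr#*/}"
  local a b c d rest

  # Split the dotted quad into its four octets.
  a="${ip%%.*}"; rest="${ip#*.}"
  b="${rest%%.*}"; rest="${rest#*.}"
  c="${rest%%.*}"; d="${rest#*.}"

  # Convert to a 32-bit integer, mask off the host bits, add the host number.
  local ip_int=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  local host_int=$(( (ip_int & mask) + hostnum ))

  printf '%d.%d.%d.%d\n' \
    $(( (host_int >> 24) & 255 )) $(( (host_int >> 16) & 255 )) \
    $(( (host_int >> 8) & 255 ))  $(( host_int & 255 ))
}

# Assumed sample service CIDR; prints 10.100.0.10
cidr_host "10.100.0.0/16" 10
```

If the value printed for your cluster's service CIDR does not match the ClusterIP of the DNS service in the cluster, the pods on the node would likely fail to resolve names, so it is worth comparing the two.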

In my launch template, I pass those values as template variables for the user data:

resource "aws_launch_template" "eks_node" {
  ...
  user_data = base64encode(templatefile("${path.module}/bootstrap.sh", {
    name   = var.name
    # Node labels rendered as "key1=value1,key2=value2", sorted by key
    labels = join(",", [for k in sort(keys(var.labels)) : "${k}=${var.labels[k]}"])
    # Taints rendered as "key=value:effect", sorted by key
    taints = join(",", [
      for key in sort([for t in var.taints : t.key]) :
      "${key}=${[for t in var.taints : t.value if t.key == key][0]}:${[for t in var.taints : t.effect if t.key == key][0]}"
    ])
    cluster_name               = var.cluster_name
    # Explicit values for the three "optional" bootstrap arguments
    dns_cluster_ip             = var.dns_cluster_ip
    certificate_authority_data = var.certificate_authority_data
    cluster_endpoint           = var.cluster_endpoint
  }))
  ...
}

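The labels and taints expressions above are dense, so here is a small bash sketch of the string format they produce: each taint becomes key=value:effect, and the entries are joined with commas. The helper names format_taint and join_by_comma, and the sample taints, are assumptions made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper: format one taint as key=value:effect.
format_taint() {
  printf '%s=%s:%s' "$1" "$2" "$3"
}

# Hypothetical helper: join all arguments with commas.
join_by_comma() {
  local IFS=,
  printf '%s' "$*"
}

# Assumed sample taints, already in key order as the Terraform code sorts them.
taints=$(join_by_comma \
  "$(format_taint dedicated gpu NoSchedule)" \
  "$(format_taint nvidia.com/gpu true NoExecute)")

echo "$taints"
# prints dedicated=gpu:NoSchedule,nvidia.com/gpu=true:NoExecute
```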

And in my bootstrap script, I use those values:

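#!/bin/bash
# Log every command so join failures are easier to debug
set -o xtrace

# Names in curly braces (labels, taints, cluster_name, ...) are filled in by
# Terraform's templatefile; $extra_args is a plain shell variable.
extra_args=""

# Only pass node labels when some were rendered
if [ -n "${labels}" ]; then
  extra_args="$extra_args --node-labels=${labels}"
fi

# Only pass taints when some were rendered
if [ -n "${taints}" ]; then
  extra_args="$extra_args --register-with-taints=${taints}"
fi

# Pass the three "optional" arguments explicitly
/etc/eks/bootstrap.sh "${cluster_name}" \
  --b64-cluster-ca "${certificate_authority_data}" \
  --apiserver-endpoint "${cluster_endpoint}" \
  --dns-cluster-ip "${dns_cluster_ip}" \
  --kubelet-extra-args "$extra_args"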
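
Before baking the script into a launch template, it can help to check the label and taint branching locally. The sketch below pulls that logic into a function and prints the resulting kubelet arguments instead of calling /etc/eks/bootstrap.sh. The function name build_extra_args and the sample values are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: same branching as the bootstrap script, but the
# labels and taints come in as arguments instead of template variables.
build_extra_args() {
  local labels="$1" taints="$2" extra_args=""

  if [ -n "$labels" ]; then
    extra_args="$extra_args --node-labels=$labels"
  fi

  if [ -n "$taints" ]; then
    extra_args="$extra_args --register-with-taints=$taints"
  fi

  printf '%s' "$extra_args"
}

# Assumed sample values; note the leading space in the result.
build_extra_args "role=gpu" "dedicated=gpu:NoSchedule"
echo
# prints " --node-labels=role=gpu --register-with-taints=dedicated=gpu:NoSchedule"
```

With both inputs empty the function returns an empty string, which matches the script passing an empty --kubelet-extra-args value.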