Hardening Kubernetes: A Practical Guide to EKS Security with Terraform and Kyverno

In this post, we will explore how to secure an Amazon EKS cluster by applying infrastructure-as-code best practices and policy-driven guardrails. We will use Terraform to provision our infrastructure and Kyverno to enforce security policies at the cluster level.

1. The Foundation: Infrastructure as Code

To minimize our attack surface, we will deploy a private EKS cluster. The control plane will be inaccessible from the public internet, forcing all management traffic through a secure VPN tunnel.

Our Terraform setup includes:

  • VPC Networking: A /16 VPC with three /24 private subnets and one public subnet for ingress.

  • Bastion-OpenVPN: A Terraform module to provide a secure gateway into our private environment.

  • EKS NodeGroups: Managed worker nodes with defined instance types.

Note: This setup is for demonstration. For production-grade architectures, refer to the Terraform modules published by aws-ia (AWS Integration and Automation) to align with AWS best practices.
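
After terraform apply, it can be worth confirming that the control plane really is private. The sketch below is one way to do that with the AWS CLI. The function names are illustrative and not part of the demo repository, and the code assumes the CLI's text output renders booleans as True/False.

```shell
# Succeeds (exit 0) only when public access is off and private access is on.
# Arguments: the values of endpointPublicAccess and endpointPrivateAccess.
endpoint_is_private() {
  pub=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  priv=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  [ "$pub" = "false" ] && [ "$priv" = "true" ]
}

# Fetches both flags for the named cluster (needs AWS credentials).
check_cluster_endpoint() {
  flags=$(aws eks describe-cluster --name "$1" \
    --query 'cluster.resourcesVpcConfig.[endpointPublicAccess,endpointPrivateAccess]' \
    --output text) || return 2
  # Text output puts both values on one line; word splitting separates them.
  # shellcheck disable=SC2086
  endpoint_is_private $flags
}

# Usage:
#   check_cluster_endpoint <CLUSTER_NAME> && echo "API endpoint is private"
```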

2. Establishing Secure Access

Because the cluster exposes only a private API endpoint, reachable from inside the VPC, we cannot reach it directly from our local machine. We use the Bastion host as an intermediary.

Connecting via OpenVPN:

  1. Generate Credentials: Access your bastion host and run: sudo /usr/local/bin/generate-client-cert.sh <client-name>.
  2. Retrieve Config: Pull the generated .ovpn file from S3: aws s3 cp s3://<bucket-name>/clients/<client-name>.ovpn .
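  3. Configure Routing: Update your .ovpn file to include a route to your VPC. OpenVPN's route directive takes a network address and a netmask rather than CIDR notation (for example, a 10.0.0.0/16 network is written as route 10.0.0.0 255.255.0.0):
route <VPC-NETWORK-ADDRESS> <SUBNET-MASK>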

4. Connect: Run sudo openvpn --config <client-name>.ovpn.
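
Since the route line in step 3 needs a dotted netmask, a small helper can derive it from the VPC CIDR. This is a minimal sketch in plain POSIX shell. The function names and the 10.0.0.0/16 example value are assumptions for illustration only.

```shell
# Convert a prefix length (0-32) into a dotted-decimal netmask.
prefix_to_netmask() {
  bits=$1
  mask=""
  for i in 1 2 3 4; do
    if [ "$bits" -ge 8 ]; then
      octet=255
      bits=$((bits - 8))
    else
      # Set the top $bits bits of this octet; no bits remain afterwards.
      octet=$((256 - (1 << (8 - bits))))
      bits=0
    fi
    mask="${mask:+$mask.}$octet"
  done
  printf '%s\n' "$mask"
}

# Turn "<network>/<prefix>" into an OpenVPN route directive.
cidr_to_route() {
  network=${1%/*}
  prefix=${1#*/}
  printf 'route %s %s\n' "$network" "$(prefix_to_netmask "$prefix")"
}

cidr_to_route "10.0.0.0/16"   # prints: route 10.0.0.0 255.255.0.0
```

The printed line can be appended to the .ovpn file before connecting.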

Once the tunnel is active, you can interact with the cluster via kubectl:

aws eks update-kubeconfig --name <CLUSTER_NAME>
kubectl get nodes

The result should look similar to this:

NAME                                              STATUS   ROLES    AGE   VERSION
ip-172-xx-yy-zzz.aws-region.compute.internal      Ready    <none>   21h   v1.34.4-eks-f69f56f
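
For scripted checks, the same output can be filtered to flag nodes that are not healthy. The helper below is a sketch. The name not_ready_nodes is made up for this example, and it treats any STATUS other than exactly Ready (including Ready,SchedulingDisabled) as worth a look.

```shell
# Reads `kubectl get nodes --no-headers` output on stdin and prints the
# names of nodes whose STATUS column is anything other than "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage, once the VPN tunnel is up:
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reported Ready.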

3. Policy-as-Code with Kyverno

Infrastructure security is only half the battle. We also need guardrails for the workloads running inside the cluster. Kyverno allows us to manage these policies as Kubernetes objects.

Installing the Policy Suite

We will deploy Kyverno along with Policy Reporter, which provides a centralized security dashboard:

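# Install Kyverno
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update
helm install kyverno --namespace kyverno --create-namespace kyverno/kyverno

# Install Policy Reporter (its chart lives in a separate Helm repository).
# The --set value names depend on the chart version; check the chart's
# values if your version rejects them.
helm repo add policy-reporter https://kyverno.github.io/policy-reporter
helm repo update
helm install policy-reporter policy-reporter/policy-reporter \
  --create-namespace --namespace policy-reporter \
  --set ui.enabled=true --set kyvernoPlugin.enabled=true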
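
Before applying any policies, it helps to wait until both releases are actually up. The following sketch wraps kubectl wait. The function name wait_for_namespace and the 120s default timeout are assumptions, not something the charts prescribe.

```shell
# Blocks until every Deployment in the given namespace reports the
# Available condition, or the timeout (default 120s) expires.
# Returns kubectl's exit status.
wait_for_namespace() {
  kubectl wait --for=condition=Available deployment --all \
    --namespace "$1" --timeout="${2:-120s}"
}

# Usage:
#   wait_for_namespace kyverno && wait_for_namespace policy-reporter
```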

Testing Guardrails

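Kyverno validate rules run in one of two modes, set by validationFailureAction:

  • Enforce: Rejects incoming requests that violate the policy at admission time.

  • Audit: Admits the request but records the violation in a policy report, without blocking the workload.

Mutate rules are a separate rule type: they modify incoming requests (e.g., adding security contexts) so that they comply with security standards.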
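
For validate rules, the mode is a single field on the policy, so moving from reporting to blocking is a one-line change. The sketch below renders a minimal validate policy in either mode. The function name render_policy and the policy name are hypothetical, and the rule mirrors the validate example in the appendix.

```shell
# Prints a minimal validate ClusterPolicy with the requested failure action.
# Only "Audit" and "Enforce" are accepted.
render_policy() {
  case $1 in
    Audit|Enforce) ;;
    *) echo "usage: render_policy Audit|Enforce" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-run-as-non-root   # hypothetical name
spec:
  validationFailureAction: $1
  background: true
  rules:
    - name: check-run-as-non-root
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Running as root is not allowed"
        pattern:
          spec:
            securityContext:
              runAsNonRoot: true
EOF
}

# Usage: start in Audit, switch to Enforce once the reports are clean.
#   render_policy Audit | kubectl apply -f -
```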

Example: Enforcing PSS (Pod Security Standards)

If we apply a mutate policy that enforces a "Restricted" security context, an Nginx pod might fail if it attempts to run as root.

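  • Mutation: The PSS Restricted mutate policy admits our Nginx pod but rewrites its security context, so the container starts as a non-root user with all capabilities dropped. The stock nginx image expects to start as root, so the pod may enter CrashLoopBackOff. A container with no such requirement, like busybox running sleep, will run successfully.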

  • Audit: By using validationFailureAction: Audit, we can track non-compliant pods without breaking existing applications. This is the recommended strategy when rolling out security policies to existing production clusters.
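
In Audit mode, the violations land in PolicyReport resources, which can be queried from the command line as well as from the dashboard. The helper below is a sketch for totalling failures. The name total_failures is invented for this example, and the usage line assumes the reports expose their counts under .summary.fail.

```shell
# Reads "<namespace> <report-name> <fail-count>" rows on stdin and prints
# the total number of failed results. Non-numeric counts are skipped.
total_failures() {
  awk '$3 ~ /^[0-9]+$/ { sum += $3 } END { print sum + 0 }'
}

# Usage against a live cluster:
#   kubectl get policyreport -A --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,FAIL:.summary.fail \
#     | total_failures
```

A non-zero total is a signal to review the reports before switching a policy to Enforce.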

4. Next Steps: Observability

Security is an ongoing process. To keep your cluster healthy and secure, implement observability using AWS-native tools like Amazon Managed Service for Prometheus (AMP) and AWS Distro for OpenTelemetry (ADOT).

Check out the terraform-aws-observability-accelerator to get started.

Final Reminder: You can find the full source code for this demonstration in my GitHub repository. Don't forget to run terraform destroy when you are finished to avoid unnecessary AWS costs!


Appendix

To open the Policy Reporter UI dashboard:

  1. Run kubectl port-forward service/policy-reporter-ui 8082:8080 -n policy-reporter
  2. Open http://localhost:8082 in your browser.

Mutate policy example taken from Kyverno

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: apply-pss-restricted-profile
  annotations:
    policies.kyverno.io/title: Apply PSS Restricted Profile
    policies.kyverno.io/category: Other, PSP Migration
    kyverno.io/kyverno-version: 1.6.2
    kyverno.io/kubernetes-version: "1.23"
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/description: Pod Security Standards define the fields and their options which are allowable for Pods to achieve certain security best practices. While these are typically validation policies, workloads will either be accepted or rejected based upon what has already been defined. It is also possible to mutate incoming Pods to achieve the desired PSS level rather than reject. This policy sets all the fields necessary to pass the PSS Restricted profile. Note that it does not attempt to remove non-compliant volumes and volumeMounts. Additional policies may be employed for this purpose.
spec:
  rules:
    - name: add-pss-fields
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        patchStrategicMerge:
          spec:
            securityContext:
              seccompProfile:
                type: RuntimeDefault
              runAsNonRoot: true
              runAsUser: 1000
              runAsGroup: 3000
              fsGroup: 2000
            containers:
              - (name): "?*"
                securityContext:
                  privileged: false
                  capabilities:
                    drop:
                      - ALL
                  allowPrivilegeEscalation: false

Nginx pod YAML

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}

Busybox pod YAML

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: busybox-0
  name: busybox-0
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: busybox-0
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
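
After applying the mutate policy and both pods, the effect of the mutation can be inspected directly. The helpers below are a sketch. The function names are illustrative, and they assume the pods were created in the current namespace.

```shell
# Prints the pod-level securityContext stored on the Pod, i.e. the result
# after any mutate rules ran at admission.
pod_security_context() {
  kubectl get pod "$1" -o jsonpath='{.spec.securityContext}'
}

# Prints the waiting reason of the first container (empty if it is running).
pod_waiting_reason() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}'
}

# Usage:
#   pod_security_context nginx
#   pod_waiting_reason nginx       # may print CrashLoopBackOff
#   pod_waiting_reason busybox-0   # expected to print nothing
```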

Validate policy example taken from Kyverno

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: pss-audit
spec:
  validationFailureAction: Audit
  background: true
  rules:
    - name: check-run-as-non-root
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Running as root is not allowed"
        pattern:
          spec:
            securityContext:
              runAsNonRoot: true

Busybox pod complying with validate policy

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: busybox-1
  name: busybox-1
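spec:
  # Pod-level security context so the pod satisfies the validate pattern above.
  # busybox runs as root by default, so an explicit non-root UID is set
  # (1000 is an arbitrary example value).
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000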
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    name: busybox-1
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
status: {}
