Manoj K R

Scaling DNS in Multi-Cluster Kubernetes with ExternalDNS (AWS Route 53)

Scaling Kubernetes DNS with ExternalDNS and Route 53

As our Kubernetes platform scaled across multiple environments and regions, DNS management became a bottleneck. A centralized setup that worked well at first began to cause operational problems such as API throttling and a growing amount of manual effort.

In this article, I’ll walk through how we evolved our DNS architecture using ExternalDNS with AWS Route 53 and the improvements we achieved.

What is ExternalDNS?

ExternalDNS is a Kubernetes controller that automatically manages DNS records based on Kubernetes resources such as Ingress, Services, and custom resources like VirtualServer or DNSEndpoint.

It continuously watches the cluster and ensures that DNS records in providers like AWS Route 53 are kept in sync with the desired state defined in Kubernetes.

In simple terms, it acts as a bridge between Kubernetes and your DNS provider, allowing DNS to be managed declaratively through Kubernetes instead of manual updates.

The challenge: centralized DNS doesn’t scale

Initially, all DNS records were managed within a single AWS Route 53 account. Over time, this led to several issues:

  • Route 53 API rate limiting (five requests per second per AWS account)
  • Frequent throttling during automated updates
  • Heavy reliance on manual DNS changes and tickets
  • Accumulation of stale or unused DNS records
  • Tight coupling between infrastructure and DNS operations

Although we were given a temporary increase to 20 requests per second, it was not a sustainable long-term solution.

The solution: distributed DNS with ExternalDNS

To address these challenges, we redesigned our DNS architecture with the following approach.

Environment-based DNS distribution

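We separated DNS management across AWS accounts based on environments (Dev, Stage, Prod). Because the Route 53 API limit applies per account, clusters in one environment no longer compete with the other environments for the same request budget.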

Kubernetes-driven DNS using ExternalDNS

We deployed ExternalDNS across clusters with the following configuration:

--policy=sync
--registry=txt
--txt-prefix=edns.
--txt-owner-id=<unique-per-cluster>

Key capabilities

Automated DNS lifecycle

DNS records are now automatically:

  • Created when resources are deployed
  • Updated when configurations change
  • Deleted when resources are removed

This removes the need for manual DNS management and ensures DNS always reflects the actual cluster state.

Self-service DNS for application teams

Application teams can now manage DNS directly through Kubernetes:

  • Deploying an Ingress or VirtualServer creates DNS records
  • Deleting the resource removes the corresponding DNS records

This significantly reduces dependency on infrastructure teams and speeds up delivery.

Safe multi-cluster ownership

Each cluster runs ExternalDNS with its own unique --txt-owner-id, which means:

  • Each cluster manages only the records it created
  • One cluster cannot accidentally delete or modify another cluster’s records

This is especially important when the same domain is shared across multiple regions (for example, dev-east and dev-west).
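
To make the ownership check concrete, here is a minimal Python sketch of how an owner ID can be read back from a registry TXT value. It is a simplified model and not ExternalDNS's actual code. The hostnames, owner IDs, and helper names (parse_owner, owned_hostnames) are hypothetical, and the plain hostname-to-value mapping glosses over how the TXT record itself is named, which depends on --txt-prefix and the ExternalDNS version.

```python
from typing import Optional

HERITAGE = "heritage=external-dns"
OWNER_KEY = "external-dns/owner"


def parse_owner(txt_value: str) -> Optional[str]:
    """Return the owner ID stored in a registry TXT value, or None if the
    value does not look like one written by ExternalDNS."""
    fields = txt_value.strip('"').split(",")
    if HERITAGE not in fields:
        return None
    for item in fields:
        key, _, value = item.partition("=")
        if key == OWNER_KEY:
            return value
    return None


def owned_hostnames(txt_records: dict[str, str], owner_id: str) -> set[str]:
    """Hostnames whose registry TXT value names `owner_id` as the owner."""
    return {
        host
        for host, value in txt_records.items()
        if parse_owner(value) == owner_id
    }


# Hypothetical zone shared by two clusters, plus one unrelated TXT record.
zone_txt = {
    "api.dev.example.com": '"heritage=external-dns,external-dns/owner=dev-east,'
                           'external-dns/resource=ingress/default/api"',
    "web.dev.example.com": '"heritage=external-dns,external-dns/owner=dev-west,'
                           'external-dns/resource=ingress/default/web"',
    "legacy.dev.example.com": '"v=spf1 -all"',
}

east = owned_hostnames(zone_txt, "dev-east")
west = owned_hostnames(zone_txt, "dev-west")
print(sorted(east))
print(sorted(west))
```

In this model, dev-east only ever sees api.dev.example.com as its own, and the record that carries no ExternalDNS heritage marker belongs to neither cluster.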

Automatic cleanup

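With --policy=sync, ExternalDNS deletes the records it owns once the Kubernetes resource behind them is gone. As a result:

  • No orphaned ExternalDNS-managed records remain
  • Managed DNS always reflects the current state of the cluster

Records that ExternalDNS does not own, such as manually created entries, are left untouched.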
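
The effect of the policy can be illustrated with a small Python model of a reconcile step. This is a sketch under simplifying assumptions rather than ExternalDNS's real planner: records are reduced to a hostname and a single target, and the names compute_plan and Plan, along with all hostnames and owner IDs, are made up for illustration. It compares sync with upsert-only, another policy ExternalDNS offers, which never deletes.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    create: dict[str, str] = field(default_factory=dict)
    update: dict[str, str] = field(default_factory=dict)
    delete: list[str] = field(default_factory=list)


def compute_plan(desired, current, owner_id, policy="sync"):
    """Compare desired records with the zone contents.

    desired: hostname -> target, derived from cluster resources.
    current: hostname -> (target, owner) as found in the zone; owner is
             None for records that were not created by ExternalDNS.
    """
    plan = Plan()
    for host, target in desired.items():
        if host not in current:
            plan.create[host] = target
            continue
        current_target, owner = current[host]
        # Only records owned by this instance may be changed.
        if owner == owner_id and current_target != target:
            plan.update[host] = target
    if policy == "sync":
        # Deletions are limited to owned records with no backing resource.
        for host, (_, owner) in current.items():
            if owner == owner_id and host not in desired:
                plan.delete.append(host)
    return plan


desired = {
    "api.dev.example.com": "lb-new.example.net",
    "web.dev.example.com": "lb-web.example.net",
}
current = {
    "api.dev.example.com": ("lb-old.example.net", "dev-east"),
    "old.dev.example.com": ("lb-old.example.net", "dev-east"),
    "other.dev.example.com": ("lb-west.example.net", "dev-west"),
    "manual.dev.example.com": ("203.0.113.10", None),
}

sync_plan = compute_plan(desired, current, "dev-east", policy="sync")
upsert_plan = compute_plan(desired, current, "dev-east", policy="upsert-only")
print(sync_plan)
print(upsert_plan)
```

Under sync, the stale old.dev.example.com record is scheduled for deletion, while the dev-west record and the manual record are never touched. Under upsert-only the same stale record would stay in the zone.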

Avoiding API rate limits

Distributing DNS across multiple AWS accounts:

  • Reduces the request load on each account
  • Gives each environment its own API quota, which removed the throttling we had been seeing

This directly addresses the limitations of the previous centralized setup.
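
A back-of-the-envelope calculation shows why splitting accounts helps. The Python sketch below uses the five requests per second limit mentioned earlier; the fleet size and the number of API calls per reconcile are invented numbers chosen only to make the arithmetic visible, and the model ignores retries and batching.

```python
def min_seconds_per_round(clusters: int, calls_per_cluster: int, rps_limit: float) -> float:
    """Lower bound on how long one full reconcile round takes when every
    cluster's API calls draw from the same per-account request budget."""
    return clusters * calls_per_cluster / rps_limit


ROUTE53_RPS = 5          # per-account limit quoted in this article
CLUSTERS_TOTAL = 12      # hypothetical fleet size
ACCOUNTS = 3             # Dev, Stage, Prod
CALLS_PER_CLUSTER = 10   # hypothetical Route 53 calls per reconcile

centralized = min_seconds_per_round(CLUSTERS_TOTAL, CALLS_PER_CLUSTER, ROUTE53_RPS)
per_account = min_seconds_per_round(
    CLUSTERS_TOTAL // ACCOUNTS, CALLS_PER_CLUSTER, ROUTE53_RPS
)

print(f"one shared account: {centralized:.0f}s per round")
print(f"one account per environment: {per_account:.0f}s per round")
```

With these made-up inputs, a single shared account needs at least 24 seconds to serve one round for all clusters, while each per-environment account needs at least 8 seconds, leaving more headroom before requests are throttled.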

Seamless cluster migration

This architecture simplifies cluster migrations:

  • Deploy resources in a new cluster → DNS records are created
  • Remove resources from the old cluster → DNS records are deleted

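DNS effectively follows the application without requiring manual coordination. One caveat: since each cluster only modifies records that carry its own owner ID, the new cluster generally cannot take over a hostname while the old cluster still owns it, so the order of the two steps matters when the hostname stays the same.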

Real impact

After implementing this model, we observed:

  • Faster and smoother deployments
  • Elimination of manual DNS tickets
  • Cleaner and more consistent DNS state
  • Safer multi-cluster operations
  • A scalable architecture ready for future growth

Key takeaways

  • Centralized DNS becomes a bottleneck at scale
  • ExternalDNS with --policy=sync enables a declarative DNS model
  • Ownership isolation is critical in multi-cluster environments
  • Distributed DNS architecture improves both performance and reliability

What’s next

This foundation also enables:

  • Easier migration to managed Kubernetes platforms (EKS, AKS, GKE)
  • Standardized DNS management across environments
  • Improved observability and governance

If you are running Kubernetes at scale and still relying on centralized or manual DNS processes, this approach is worth exploring.
