Pod-to-pod network delays in AKS

deyanp ・ 3 min read

For the past few months I have been observing intermittent network delays in the internal http communication between pods in our AKS clusters. But what do "intermittent" and "slow" actually mean? A picture is worth a thousand words, so here we go:

[Screenshots: Application Insights dependency durations, with intermittent outliers]

You will quickly notice that there are some outliers, ranging from 200-300ms to 2 seconds!

Upon closer inspection, one of the out-of-process pod-to-pod calls is taking too long:

[Screenshot: Application Insights end-to-end transaction details - one pod-to-pod dependency call taking unusually long]

Now, the reasons can be manifold:

  1. Sender issue (delay in sending)
  2. Network issue (delay in transmitting)
  3. Recipient issue (delay in receiving/processing)

From the last screenshot above it seems that the recipient processed the request very quickly, and the time was lost somewhere in between.

Even though I had the feeling it would not help much, I opened an MS Support ticket to check for network issues in our AKS clusters ... The MS Support agent advised sniffing the network traffic, which turned out to be a really good hint (I had to brush up my Wireshark knowledge, though).

It turns out it is quite easy to sniff network traffic between pods in a K8s cluster (provided you have admin access to it). MS advised using ksniff, though there are also easier ways which I only googled later on. With ksniff you just need to:

  1. Install krew (plugin installer for kubectl) - https://krew.sigs.k8s.io/docs/user-guide/setup/install/

  2. Install ksniff - https://github.com/eldadru/ksniff#installation

  3. Install Wireshark/tshark - here I hit a problem: the default Ubuntu repositories contain an old version of Wireshark/tshark, so a GitHub issue comment helped me out

  4. Sniff

kubectl sniff POD-NAME -n POD-NAMESPACE-NAME -p

(the -p argument is important; otherwise I was getting a " ... cannot access '/tmp/static-tcpdump': No such file or directory" error). A condensed version of the whole sequence is sketched below.
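
For reference, here is a rough sketch of the whole sequence on an Ubuntu workstation. The pod name and namespace are placeholders (as above), and the wireshark-dev/stable PPA is only an assumption for getting a newer Wireshark build - adapt it to whatever the issue comment in step 3 suggests:

# 1. Install krew following the setup guide linked above, then install ksniff as a kubectl plugin
kubectl krew install sniff

# 2. Install Wireshark/tshark; the default Ubuntu packages may be too old,
#    so a newer build may be needed (e.g. via the wireshark-dev/stable PPA - assumption)
sudo add-apt-repository ppa:wireshark-dev/stable
sudo apt-get update && sudo apt-get install -y wireshark tshark

# 3. Sniff the pod's traffic; -p runs ksniff in privileged mode (a privileged pod on the
#    same node) instead of uploading a static tcpdump binary into the target container
kubectl sniff POD-NAME -n POD-NAMESPACE-NAME -p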

... and voila, Wireshark opens automatically and starts receiving network traffic. I put http in the display filter so that I could see only the http traffic and started waiting for another occurrence of the problem ... which took 7-8 hours. This is what I got (317 ms instead of 2 seconds this time, but it varies):

[Screenshot: the captured occurrence - 317 ms this time]

However, the request which was taking 300+ ms in the Application Insights screenshots took only 1 ms in this trace ... and the strange thing is that it started much later - around 300 ms later than what I saw in Application Insights ...

Removing the http filter in Wireshark showed some interesting DNS communication, with the request-response marked in red taking 300ms!
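
If you prefer the command line over the Wireshark UI, tshark can extract slow DNS answers from a saved capture directly. The file name capture.pcap and the 100 ms threshold are just assumptions for illustration:

# Show DNS responses that took longer than 100 ms to arrive
# (dns.time is the time between a DNS query and its matching response)
tshark -r capture.pcap -Y "dns.time > 0.1" -T fields \
  -e frame.time_relative -e dns.qry.name -e dns.time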

[Screenshot: Wireshark capture without the http filter - the 300 ms DNS request-response marked in red]

It turns out we are using target service hostnames (a Kubernetes Service, behind which there are one or more pods) in our calling pods' configuration like this:

http://xxxxxxxx.default.svc.cluster.local/api/...

and K8s is actually trying multiple DNS lookups, appending .default.svc.cluster.local (or parts of it) again, until at the end it tries to look up the original hostname, which is of course found immediately ... and one of these extra DNS lookups takes longer from time to time. The culprit is the ndots:5 default in the pod's /etc/resolv.conf: a name with fewer than five dots is first tried with every search suffix appended before being tried as-is, and xxxxxxxx.default.svc.cluster.local has only four dots.
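
You can see this behaviour directly in the pod's resolver configuration; a quick check (pod name is a placeholder, and the exact nameserver and search domains depend on your cluster) looks like this:

# Print the resolver configuration of the calling pod
kubectl exec POD-NAME -n default -- cat /etc/resolv.conf

# Typical output (values differ per cluster):
#   nameserver 10.0.0.10
#   search default.svc.cluster.local svc.cluster.local cluster.local
#   options ndots:5
#
# With ndots:5 the name is first tried with each search suffix appended, e.g.
#   xxxxxxxx.default.svc.cluster.local.default.svc.cluster.local
#   xxxxxxxx.default.svc.cluster.local.svc.cluster.local
#   xxxxxxxx.default.svc.cluster.local.cluster.local
# and only then as-is.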

Solution (at least for now, until MS advises what a better one could be): remove the suffix .default.svc.cluster.local from all target service hostnames in our calling pod configurations. The picture is different now:

[Screenshot: the same view after removing the suffix]
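
To illustrate the change (xxxxxxxx is the same placeholder service name as above, and nslookup is only available if the calling image ships it):

# The short service name is resolved via the first search suffix (default.svc.cluster.local)
kubectl exec POD-NAME -n default -- nslookup xxxxxxxx

# Calling pod configuration before:  http://xxxxxxxx.default.svc.cluster.local/api/...
# Calling pod configuration after:   http://xxxxxxxx/api/...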

Hope the above helps someone avoid this issue!

Discussion (2)

Eldad Rudich: Glad you liked ksniff!

deyanp (Author): Yeap, thanks for creating it!