DEV Community

Cover image for Build your Service Mesh: The Proxy
Ramón Berrutti
Ramón Berrutti

Posted on • Updated on

Build your Service Mesh: The Proxy

DIY Service Mesh

This is a Do-It-Yourself Service Mesh, a simple tutorial for understanding
the internals of a service mesh. This project aims to provide a simple,
easy-to-understand reference implementation of a service mesh, which can be used
to learn about the various concepts and technologies used by a service mesh like Linkerd.

What are you going to learn?

  • Build a simple proxy and add service mesh features to it.
  • Use Netfilter to intercept and modify network packets.
  • Create a simple control plane to manage the service mesh.
  • Use gRPC to communicate between the proxy and the control plane.
  • Create an Admission Controller to validate and mutate Kubernetes resources.
  • Certificate generation flow and mTLS between the services.
  • How HTTP/2 works and how to use it with gRPC to balance the traffic between the services.
  • Add useful features like circuit breaking, retries, timeouts, and load balancing.
  • Add metrics and tracing to the service mesh with OpenTelemetry.
  • Canary deployments.

Some considerations

  • Only for learning propose, not a production-ready service mesh.
  • The proxy will print many logs to understand what is happening.
  • Use IPTables instead of Nftables for simplicity.
  • Keep the code as simple as possible to make it easy to understand.
  • Some Golang errors are ignored for simplicity.
  • Everything will be in the same repository to make it easier to understand the project.

What is going to be built?

The following components are going to be built:

  • proxy-init: Configure the network namespace of the pod.
  • proxy: This is the data plane of the service mesh, which is responsible for intercepting and modifying network packets.
  • controller: This is the control plane of the service mesh, which is responsible for configuring the data plane.
  • injector: This is an Admission Controller for Kubernetes, which mutates each pod that needs to be meshed.
  • samples apps: Four simple applications will communicate with each other. (http-client, http-server, grpc-client, grpc-server)

Tools and Running the project

  • kind to create a Kubernetes cluster locally.
  • Tilt to run the project and watch for changes.
  • Buf to lint and generate the Protobuf/gRPC code.
  • Docker to build the Docker images.
  • k9s to interact with the Kubernetes cluster. (Optional)

To start all the components, run the following command:

kind create cluster
tilt up
Enter fullscreen mode Exit fullscreen mode

Tilt will build all the images and deploy all the components to the Kubernetes cluster.
All the images are created by the Dockerfile in the root directory.

The main branch contains the WIP version of the project.

Architecture

The architecture of the service mesh is composed of the following components:

Architecture

Creating the HTTP applications

  • http-client: This application is going to be called the http-server service.
  • http-server: This application will be called by the http-client service.

In the next steps, grpc-client and grpc-server will be created.

http-client:

func main() {
    ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
    defer cancel()
    n := 0

    endpoint := os.Getenv("ENDPOINT")
    if endpoint == "" {
        endpoint = "http://http-server.http-server.svc.cluster.local./hello"
    }

    httpClient := &http.Client{
        Timeout: 5 * time.Second,
    }

    // This application will call the endpoint every second
    ticker := time.NewTicker(time.Second)
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
            if err != nil {
                panic(err)
            }

            resp, err := httpClient.Do(req)
            if err != nil {
                panic(err)
            }

            dump, err := httputil.DumpResponse(resp, true)
            if err != nil {
                panic(err)
            }
            resp.Body.Close()

            n++
            fmt.Printf("Response #%d\n", n)
            fmt.Println(string(dump))
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

http-server:

func main() {
    ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
    defer cancel()

    failRate, _ := strconv.Atoi(os.Getenv("FAIL_RATE"))

    n := uint64(0)
    hostname := os.Getenv("HOSTNAME")
    version := os.Getenv("VERSION")

    var b bytes.Buffer
    b.WriteString("Hello from the http-server service! Hostname: ")
    b.WriteString(hostname)
    b.WriteString(" Version: ")
    b.WriteString(version)

    mux := http.NewServeMux()
    mux.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
        // Dump the request
        dump, _ := httputil.DumpRequest(r, true)
        fmt.Printf("Request #%d\n", atomic.AddUint64(&n, 1))
        fmt.Println(string(dump))
        fmt.Println("---")

        // Simulate failure
        if failRate > 0 {
            // Get a random number between 0 and 100
            n := rand.Intn(100)
            if n < failRate {
                http.Error(w, "Internal server error", http.StatusInternalServerError)
                fmt.Println("Failed to process request")
                return
            }
        }

        w.Header().Set("Content-Type", "text/plain")
        w.WriteHeader(http.StatusOK)
        w.Write(b.Bytes())
    })

    server := &http.Server{
        Addr:    ":8080",
        Handler: mux,
    }

    g, ctx := errgroup.WithContext(ctx)
    g.Go(func() error {
        <-ctx.Done()
        return server.Shutdown(context.Background())
    })

    g.Go(func() error {
        return server.ListenAndServe()
    })

    if err := g.Wait(); err != nil {
        if err != http.ErrServerClosed {
            panic(err)
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

The http-server service is going to respond with a message that contains the
hostname and the version of the service.
Failures will be simulated by setting the FAIL_RATE environment variable.

Each service is going to be deployed in a different namespace:

  • http-client: http-client Deployment in the http-client namespace: http-client.yaml
  • http-server: http-server Deployment in the http-server namespace: http-server.yaml

Testing the service mesh

Logs for http-client and http-server pods to see the communication between the services.

http-client logs:

Response #311
HTTP/1.1 200 OK
Content-Length: 71
Content-Type: text/plain
Date: Sat, 08 Jun 2024 19:38:27 GMT

Hello from http-server service! Hostname: http-server-799c77dc9b-56lmd Version: 1.0
Enter fullscreen mode Exit fullscreen mode

http-server logs:

Request #171
GET /hello HTTP/1.1
Host: http-server.http-server.svc.cluster.local.
Accept-Encoding: gzip
User-Agent: Go-http-client/1.1
Enter fullscreen mode Exit fullscreen mode

Implementing the proxy to intercept the HTTP/1.1 requests and responses.

Why need a proxy?

The proxy will intercept all of the inbound and outbound traffic of the services. (except explicitly ignored)
Linkerd has a Rust based proxy called linkerd2-proxy, and Istio has a C++ based proxy called Envoy,
which is a very powerful proxy with a lot of features.

The proxy code is going to be very simple and will be similar to linkerd2-proxy functionalities.

For now, the proxy will listen on two ports: one for inbound traffic and another for outbound traffic.

  • 4000 for the inbound traffic.
  • 5000 for the outbound traffic.

This is a basic proxy implementation that intercepts HTTP requests and HTTP responses.

func main() {
    ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
    defer cancel()

    g, ctx := errgroup.WithContext(ctx)
    // Inbound connection
    g.Go(func() error {
        return listen(ctx, ":4000", handleInboundConnection)
    })

    // Outbound connection
    g.Go(func() error {
        return listen(ctx, ":5000", handleOutboundConnection)
    })

    if err := g.Wait(); err != nil {
        panic(err)
    }
}

func listen(ctx context.Context, addr string, accept func(net.Conn)) error {
    l, err := net.Listen("tcp", addr)
    if err != nil {
        return fmt.Errorf("failed to listen: %w", err)
    }
    defer l.Close()
    go func() {
        <-ctx.Done()
        l.Close()
    }()

    for {
        conn, err := l.Accept()
        if err != nil {
            return fmt.Errorf("failed to accept: %w", err)
        }

        go accept(conn)
    }
}
Enter fullscreen mode Exit fullscreen mode

The listen function listens on the port and calls the accept function when a connection is established.

handleInboundConnection

All the inbound traffic is going to be intercepted by the proxy, print the request, forward the request
to the local destination port and print the response.

func handleInboundConnection(c net.Conn) {
    defer c.Close()

    // Get the original destination
    _, port, err := getOriginalDestination(c)
    if err != nil {
        fmt.Printf("Failed to get original destination: %v\n", err)
        return
    }

    fmt.Printf("Inbound connection from %s to port: %d\n", c.RemoteAddr(), port)

    // Read the request
    req, err := http.ReadRequest(bufio.NewReader(c))
    if err != nil {
        fmt.Printf("Failed to read request: %v\n", err)
        return
    }

    // Call the local service port
    upstream, err := net.Dial("tcp", fmt.Sprintf("127.0.0.1:%d", port))
    if err != nil {
        fmt.Printf("Failed to dial: %v\n", err)
        return
    }
    defer upstream.Close()

    // Write the request
    if err := req.Write(io.MultiWriter(upstream, os.Stdout)); err != nil {
        fmt.Printf("Failed to write request: %v\n", err)
        return
    }

    // Read the response
    resp, err := http.ReadResponse(bufio.NewReader(upstream), req)
    if err != nil {
        fmt.Printf("Failed to read response: %v\n", err)
        return
    }
    defer resp.Body.Close()

    // Write the response
    if err := resp.Write(io.MultiWriter(c, os.Stdout)); err != nil {
        fmt.Printf("Failed to write response: %v\n", err)
        return
    }

    // Add a newline for better readability
    fmt.Println()
}
Enter fullscreen mode Exit fullscreen mode

The handleInboundConnection function first reads the destination port to our service,
iptables will set the destination port using the SO_ORIGINAL_DST socket option.
The function getOriginalDestination returns the original destination of the TCP connection,
check the code to see how it works. (This is a Linux specific feature)

After that, read the request, forward the request to the local service port, read
the response and send it back to the client.

For visibility, print the request and response using io.MultiWriter to write to the connection and stdout.

handleOutboundConnection

The outbound look very similar to the inbound, but forward the request to the target service.

func handleOutboundConnection(c net.Conn) {
    defer c.Close()

    // Get the original destination
    ip, port, err := getOriginalDestination(c)
    if err != nil {
        fmt.Printf("Failed to get original destination: %v\n", err)
        return
    }

    fmt.Printf("Outbound connection to %s:%d\n", ip, port)

    // Read the request
    req, err := http.ReadRequest(bufio.NewReader(c))
    if err != nil {
        fmt.Printf("Failed to read request: %v\n", err)
        return
    }

    // Call the external service ip:port
    upstream, err := net.Dial("tcp", fmt.Sprintf("%s:%d", ip, port))
    if err != nil {
        fmt.Printf("Failed to dial: %v\n", err)
        return
    }
    defer upstream.Close()

    // Write the request
    if err := req.Write(io.MultiWriter(upstream, os.Stdout)); err != nil {
        fmt.Printf("Failed to write request: %v\n", err)
        return
    }

    // Read the response
    resp, err := http.ReadResponse(bufio.NewReader(upstream), req)
    if err != nil {
        fmt.Printf("Failed to read response: %v\n", err)
        return
    }
    defer resp.Body.Close()

    // Write the response
    if err := resp.Write(io.MultiWriter(c, os.Stdout)); err != nil {
        fmt.Printf("Failed to write response: %v\n", err)
        return
    }

    // Add a newline for better readability
    fmt.Println()
}
Enter fullscreen mode Exit fullscreen mode

As can be seen, the only difference is in how the external service is called.

    // Call the external service ip:port
    upstream, err := net.Dial("tcp", fmt.Sprintf("%s:%d", ip, port))
    if err != nil {
        fmt.Printf("Failed to dial: %v\n", err)
        return
    }
    defer upstream.Close()
Enter fullscreen mode Exit fullscreen mode

It is important to note that the service resolves the DNS, so only the IP and the port need to be provided.

How are the connections intercepted?

Kubernetes Pod Networking Understanding

Each kubernetes pod shares the same network between the containers, so the localhost is the same for all the containers in the pod.

initContainers

Kubernetes has a feature called initContainers, which is a container that runs before the main containers starts. These containers need to finish before the main containers starts.

iptables

The iptables is a powerful tool to manage Netfilter in Linux, it can be used to intercept and modify the network packets.

Before our http-client and http-server containers starts, the proxy-init is going to configure the Netfilter to redirect
all the traffic to the proxy inbounds and outbounds ports.

func main() {
    // Configure the proxy
    commands := []*exec.Cmd{
        // Default accept for all nat chains
        exec.Command("iptables", "-t", "nat", "-P", "PREROUTING", "ACCEPT"),
        exec.Command("iptables", "-t", "nat", "-P", "INPUT", "ACCEPT"),
        exec.Command("iptables", "-t", "nat", "-P", "OUTPUT", "ACCEPT"),
        exec.Command("iptables", "-t", "nat", "-P", "POSTROUTING", "ACCEPT"),

        // Create custom chains so is possible jump to them
        exec.Command("iptables", "-t", "nat", "-N", "PROXY_INBOUND"),
        exec.Command("iptables", "-t", "nat", "-N", "PROXY_OUTBOUND"),

        // Jump to custom chains, if something is not matched, will return to the default chains.
        exec.Command("iptables", "-t", "nat", "-A", "PREROUTING", "-p", "tcp", "-j", "PROXY_INBOUND"),
        exec.Command("iptables", "-t", "nat", "-A", "OUTPUT", "-p", "tcp", "-j", "PROXY_OUTBOUND"),

        // Set rules for custom chains: PROXY_INBOUND, redirect all inbound traffic to port 4000
        exec.Command("iptables", "-t", "nat", "-A", "PROXY_INBOUND", "-p", "tcp", "-j", "REDIRECT", "--to-port", "4000"),

        // Set rules for custom chains: PROXY_OUTBOUND
        // Ignore traffic between the containers.
        exec.Command("iptables", "-t", "nat", "-A", "PROXY_OUTBOUND", "-o", "lo", "-j", "RETURN"),
        exec.Command("iptables", "-t", "nat", "-A", "PROXY_OUTBOUND", "-d", "127.0.0.1/32", "-j", "RETURN"),

        // Ignore outbound traffic from the proxy container.
        exec.Command("iptables", "-t", "nat", "-A", "PROXY_OUTBOUND", "-m", "owner", "--uid-owner", "1337", "-j", "RETURN"),

        // Redirect all the outbound traffic to port 5000
        exec.Command("iptables", "-t", "nat", "-A", "PROXY_OUTBOUND", "-p", "tcp", "-j", "REDIRECT", "--to-port", "5000"),
    }

    for _, cmd := range commands {
        if err := cmd.Run(); err != nil {
            panic(fmt.Sprintf("failed to run command %s: %v\n", cmd.String(), err))
        }
    }

    fmt.Println("Proxy initialized successfully!")
}
Enter fullscreen mode Exit fullscreen mode

Some important points:

  • To allow outbound traffic from the proxy container, the iptables option --uid-owner is used.
  • The use of custom chains is to make it easier to understand the rules and allow to return to the default chains if the rule is not matched.
  • The REDIRECT option is used to redirect the traffic to the proxy and is the responsable to add the SO_ORIGINAL_DST information to the socket.

Adding the proxy and proxy-init containers to the deployments:

    spec:
      initContainers:
      - name: proxy-init
        image: diy-sm-proxy-init
        imagePullPolicy: IfNotPresent
        securityContext:
          capabilities:
            add:
              - NET_ADMIN
              - NET_RAW
            drop:
              - ALL
      containers:
      - name: proxy
        image: diy-sm-proxy
        imagePullPolicy: IfNotPresent
        securityContext:
          runAsUser: 1337
      - name: http-client
        image: diy-sm-http-client
        imagePullPolicy: IfNotPresent
Enter fullscreen mode Exit fullscreen mode

The same configuration is going to be applied to the http-server deployment.

Some important points:

  • proxy-init run as init container that is going to configure the network namespace of the pod and exit.
  • NET_ADMIN and NET_RAW are linux capabilities that are necessary to configure the Netfilter, without these capabilities iptables can't call the system calls to configure the Netfilter.
  • Using the runAsUser: 1337 in the proxy container is very important so the proxy traffic to outside of the pod is allowed.

Logs output for the proxy and the applications

http-server logs:

proxy Inbound connection from 10.244.0.115:60296 to port: 8080
http-server Request #13
http-server GET /hello HTTP/1.1
http-server Host: http-server.http-server.svc.cluster.local.
http-server Accept-Encoding: gzip
http-server User-Agent: Go-http-client/1.1
proxy GET /hello HTTP/1.1
proxy Host: http-server.http-server.svc.cluster.local.
proxy User-Agent: Go-http-client/1.1
proxy Accept-Encoding: gzip
proxy
proxy HTTP/1.1 200 OK
proxy Content-Length: 86
proxy Content-Type: text/plain
proxy Date: Mon, 17 Jun 2024 20:58:46 GMT
proxy
proxy Hello from the http-server service! Hostname: http-server-c6f4776bb-mmw2d Version: 1.0
Enter fullscreen mode Exit fullscreen mode

http-client logs:

proxy Outbound connection to 10.96.182.169:80
proxy GET /hello HTTP/1.1
proxy Host: http-server.http-server.svc.cluster.local.
proxy User-Agent: Go-http-client/1.1
proxy Accept-Encoding: gzip
proxy
proxy HTTP/1.1 200 OK
proxy Content-Length: 86
proxy Content-Type: text/plain
proxy Date: Mon, 17 Jun 2024 21:04:53 GMT
proxy
proxy Hello from the http-server service! Hostname: http-server-c6f4776bb-slpdf Version: 1.0
http-client Response #16
http-client HTTP/1.1 200 OK
http-client Content-Length: 86
http-client Content-Type: text/plain
http-client Date: Mon, 17 Jun 2024 21:04:53 GMT
http-client
http-client Hello from the http-server service! Hostname: http-server-c6f4776bb-slpdf Version: 1.0
Enter fullscreen mode Exit fullscreen mode

Adding manually the proxy-init and proxy containers?

Doing this is not so practical, right?

Kubernetes has a feature called Admission Controller, which is a webhook that can validate and mutate the objects before they are persisted in the etcd.
Learn more about it here.

But in the next steps, a Mutation Admission Controller will be created to inject the proxy-init and proxy containers into the pods that need to be meshed.

Next

For the next part, the focus will be to create the Admission Controller to inject the proxy-init and the proxy.

Contact and Github Project

The github project contains the WIP of the project.

Feel free to contact me: https://www.linkedin.com/in/ramonberrutti/

Top comments (0)