DEV Community

Nuno Pinho

Observability Stack and CI/CD Workflow: A Technical Breakdown 🛠

Hi everyone! 👋
I wanted to take a moment to share one of the projects I've been working on recently. While I'm not an expert at creating system diagrams, I've been trying (and still am) to architect a modern microservices ecosystem.
I must admit that I don't usually share much, but I've been inspired to be more active, so I'll share some insights as well as troubleshooting steps that may be both interesting and helpful to others facing similar challenges. So, without further ado, this diagram represents an observability infrastructure stack and a CI/CD workflow(ish):

[Diagram: observability infrastructure stack and CI/CD workflow]

  1. Every time a new feature is ready to be released, a pull request is opened against the base branch. Once the PR is approved, the feature branch is merged into the base branch. 🚀 VPS Continuous Integration (CI) workflow starts 🏗

  2. The GitHub Actions CI workflow starts, runs the unit tests, and builds the container image. In this particular case, the image is tagged with a value generated from the build id. The workflow also runs other actions related to the build and deployment. Once completed, a new image version (let's call it vps-1.x) is pushed to a private container registry. 🚀 VPS Continuous Integration (CI) workflow ends and Continuous Delivery (CD) workflow starts 🚚

  3. The image is deployed to a single-node cluster orchestrated by Kubernetes. Some services use a plain Kubernetes YAML template, others a Helm template (as needed).

  4. Every time a new service is deployed, it needs to be associated with the API gateway, and an ingress needs to be created. The ingress defines the configuration for routing external traffic to specific services in the cluster. In this particular case, it's responsible for routing external HTTPS traffic for a given domain to the Kubernetes service, and it handles load balancing. It also manages SSL certificates for secure connections and terminates SSL in front of the target service, essentially acting as a reverse proxy. 🚀 VPS Continuous Delivery (CD) workflow ends 🏁

  5. When someone tries to access a specific resource via the API Gateway, the request is forwarded to the ingress, which then calls the corresponding microservice. Each microservice communicates with the auth service using OIDC, verifying tokens and their access rights before processing requests. Even though the API Gateway already performs authentication, some of the microservices do so as well, to avoid a single point of failure.

  6. There's an efficient data replication mechanism powered by a scheduled cron job. This job periodically creates snapshots of the internal database systems and securely replicates them to an external database. This ensures data redundancy, high availability, and disaster recovery preparedness.

  7. Depending on the GitHub Action run on the main branch, a tailored custom image is built. In this particular case, for cloud deployments, a distinct image (let's call it aws-1.x) is built to meet the specific requirements of a cloud function deployment. The aws-1.x image is pushed to a private Elastic Container Registry (ECR) and is also backed up in another private container registry, so that a copy of the image survives if one registry becomes unavailable. 🚀 AWS Cloud Continuous Integration (CI) workflow starts ☁️

  8. The security measures extend to the AWS ecosystem. Amazon ECR uses AWS Identity and Access Management (IAM) permissions to check whether the user or third party attempting to access the image is authorized, so that only authorized entities can pull (read) or push (write) the image. 🚀 AWS Cloud Continuous Integration (CI) workflow ends and AWS Cloud Continuous Delivery (CD) workflow starts 🚀

  9. AWS Cloud runs services across multiple Availability Zones, and in each of these zones a network of services is at play. In this case, the AWS Lambdas and AWS API Gateway are associated with IAM roles whose permission policies are scoped to the requirements of the services they support.

  10. Thanks to the earlier GitHub Actions job runs, every AWS Lambda function is automatically updated to the newly deployed image version. I'm also mindful of the common cold start issue that some of the deployed Lambdas might hit (and some already have), particularly after periods of inactivity. I'm already working on additional configuration to introduce SnapStart capabilities, which I will eventually discuss in another post. 🚀 AWS Cloud Continuous Delivery (CD) workflow ends 🌐

  11. Unlike the AWS Lambdas, which need to be updated whenever resources are added to or removed from the microservices, API Gateway works differently. Once created and associated with a Lambda, it needs no further changes, because it relies on a proxy resource within the API Gateway REST API. This removes the need to update the API Gateway, or to create new functions, every time a new resource is introduced.

  12. Currently, the AWS Lambdas and API Gateway send their logs to CloudWatch, which serves as the primary tool for log management and insights. I've already started the transition to Datadog, but I want to highlight one particular challenge regarding billing logs. I'm looking for a way to make the transition smooth while keeping access to the billing data, since I want to make the most of the AWS Free Tier while avoiding astronomical overnight costs.
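
To make step 2 a bit more concrete, here is a minimal Python sketch of how a build-id based image tag could be produced inside a CI job. The `vps-1.<build id>` scheme, the registry host and the repository name are assumptions made for illustration, not the actual values of this project. `GITHUB_RUN_NUMBER` is one of the default environment variables GitHub Actions exposes to a job.

```python
import os
import re


def image_tag(prefix: str, build_id: str) -> str:
    """Build a tag such as 'vps-1.42' from a numeric build id (hypothetical scheme)."""
    if not re.fullmatch(r"[0-9]+", build_id):
        raise ValueError(f"build id must be numeric, got {build_id!r}")
    return f"{prefix}-1.{build_id}"


def image_reference(registry: str, repository: str, tag: str) -> str:
    """Join registry, repository and tag into a full image reference."""
    return f"{registry}/{repository}:{tag}"


if __name__ == "__main__":
    # Falls back to "0" when run outside GitHub Actions.
    build_id = os.environ.get("GITHUB_RUN_NUMBER", "0")
    # The registry host and repository name below are placeholders.
    print(image_reference("registry.example.com", "my-service", image_tag("vps", build_id)))
```

Rejecting non-numeric build ids up front keeps malformed tags from reaching the registry.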
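
The ingress described in step 4 could look roughly like the manifest generated below. This is only a sketch: the ingress name, host, service name, port and TLS secret name are all hypothetical, and the real setup may rely on annotations or an ingress class that are not shown here. The manifest is built as a Python dict and printed as JSON, which `kubectl apply -f -` accepts just like YAML.

```python
import json
from typing import Any, Dict


def build_ingress(name: str, host: str, service_name: str,
                  service_port: int, tls_secret: str) -> Dict[str, Any]:
    """Return a networking.k8s.io/v1 Ingress that routes HTTPS traffic for one host."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # TLS is terminated at the ingress using the certificate in this secret.
            "tls": [{"hosts": [host], "secretName": tls_secret}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {
                        "service": {
                            "name": service_name,
                            "port": {"number": service_port},
                        }
                    },
                }]},
            }],
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    manifest = build_ingress("orders-ingress", "orders.example.com",
                             "orders-service", 8080, "orders-tls")
    print(json.dumps(manifest, indent=2))
```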
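
For the second layer of token checks mentioned in step 5, a microservice could validate the claims of a token after its signature has been verified. The sketch below covers only that claims part (issuer, audience, expiry) using the standard library. Verifying the signature against the auth service's published keys is deliberately left out and would need a JWT library. The function name and the example values are hypothetical.

```python
import time
from typing import Any, Mapping, Optional


def claims_are_acceptable(claims: Mapping[str, Any], issuer: str, audience: str,
                          now: Optional[float] = None) -> bool:
    """Check iss, aud and exp on claims whose signature was already verified."""
    current = time.time() if now is None else now

    if claims.get("iss") != issuer:
        return False

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, list):
        audiences = aud
    else:
        audiences = []
    if audience not in audiences:
        return False

    # Reject tokens with a missing or already passed expiry time.
    exp = claims.get("exp")
    return isinstance(exp, (int, float)) and exp > current
```

Passing `now` explicitly makes the expiry check easy to test without patching the clock.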
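
The proxy resource idea from step 11 can be illustrated with a small Lambda handler. With a Lambda proxy integration on a REST API, API Gateway forwards every request to the function with fields such as `httpMethod` and `path`, so routing lives inside the function and a new resource only needs a new entry in the routing table. The routes and handler functions below are hypothetical, and this is a sketch of the pattern rather than the code used in this project.

```python
import json
from typing import Any, Callable, Dict, Tuple

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


def health(event: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "ok"}


def list_orders(event: Dict[str, Any]) -> Dict[str, Any]:
    return {"orders": []}


# Adding a resource means adding an entry here; the API Gateway stays untouched.
ROUTES: Dict[Tuple[str, str], Handler] = {
    ("GET", "/health"): health,
    ("GET", "/orders"): list_orders,
}


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Dispatch a Lambda proxy integration event to the matching route."""
    key = (event.get("httpMethod", ""), event.get("path", ""))
    route = ROUTES.get(key)
    if route is None:
        status, payload = 404, {"message": "not found"}
    else:
        status, payload = 200, route(event)
    # Proxy integrations expect a statusCode and a string body in the response.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```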

🚧 Disclaimer: The majority of this is implemented, but there are some features/aspects that are still under construction, so to speak. 🚧

I'm always open to discussions about this project, its architecture, or even sharing experiences on similar projects. Feel free to connect or send me a message if you'd like to discuss any aspect or if you have any questions. Let's keep the conversation going! 🔗
