Table of contents
- Why these projects and why CareerByteCode matters
- Project 1 — Serverless HTML → PDF pipeline (Azure Service Bus + Functions + Headless Chrome)
  - Problem statement
  - Architecture overview
  - Step-by-step implementation (code included)
  - Common pitfalls & answers
- Project 2 — GitOps CI/CD to AKS (Terraform + GitHub Actions + Helm)
  - Problem statement
  - Architecture overview
  - Step-by-step implementation (code included)
  - Common pitfalls & answers
- Project 3 — Real-time telemetry pipeline (Event Hubs → Stream Analytics → Cosmos DB + Power BI)
  - Problem statement
  - Architecture overview
  - Step-by-step implementation (code included)
  - Common pitfalls & answers
- Related tools & libraries
- Developer tips & best practices
- Common developer questions (short answers)
- Conclusion
Why these projects and why CareerByteCode matters
Developers succeed when they solve realistic constraints, not toy problems. The three projects below are deliberately chosen because they appear in real production systems:
- Asynchronous processing (PDF generation, email batching, long-running work).
- CI/CD & GitOps (reliable deployment across clusters).
- Real-time telemetry (observability, scaling, and streaming transforms).
CareerByteCode's catalog of 1900+ real-time projects provides curated, executable projects with full infra + app code + deployment guides. That density helps you:
- Reuse patterns → accelerate architecture learning.
- Practice full-stack flows end-to-end (infra, app, pipeline, monitoring).
- Move from concept to production-ready artifacts — the difference between "knowing" and "doing."
Project 1 — Serverless HTML → PDF pipeline (Azure Service Bus + Functions + Headless Chrome)
Problem statement
Generate PDFs from incoming HTML pages at scale with reliability, retry semantics, and no dedicated VM fleet. Requirements:
- Accept HTML or URL payloads.
- Queue requests, process with worker Functions to avoid timeouts.
- Support headless Chrome rendering and CSS/media queries.
- Store PDFs in Azure Blob Storage and emit completion events.
Architecture overview
Client → HTTP Function (enqueue) → Azure Service Bus Queue → Consumption Function (containerized Headless Chrome) → Blob Storage → (optional) Notification Topic/Event Grid.
Step-by-step implementation
1) Provision infra (minimal az cli + Terraform snippet)
Terraform (snippet — create resource group, storage account, service bus namespace/queue):
# main.tf (snippet)
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "rg" {
  name     = "pdf-rg"
  location = "westeurope"
}

resource "azurerm_storage_account" "sa" {
  name                     = "pdfstore${random_id.suffix.hex}"
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "random_id" "suffix" {
  byte_length = 4
}

resource "azurerm_servicebus_namespace" "sb" {
  name                = "pdf-sb-namespace"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  sku                 = "Standard"
}

resource "azurerm_servicebus_queue" "queue" {
  name                = "pdf-requests"
  namespace_name      = azurerm_servicebus_namespace.sb.name
  resource_group_name = azurerm_resource_group.rg.name
}
2) HTTP enqueue function (Node.js / TypeScript)
enqueue/index.js:
const { ServiceBusClient } = require("@azure/service-bus");

const connectionString = process.env.SERVICEBUS_CONN;
const queueName = process.env.QUEUE_NAME;

module.exports = async function (context, req) {
  const payload = req.body;
  if (!payload || (!payload.html && !payload.url)) {
    context.res = { status: 400, body: "Provide html or url" };
    return;
  }

  // @azure/service-bus v7 API: create a client and a sender for the queue
  const sbClient = new ServiceBusClient(connectionString);
  const sender = sbClient.createSender(queueName);

  const message = {
    body: payload,
    contentType: "application/json",
    subject: "pdfRequest",
  };

  await sender.sendMessages(message);
  await sender.close();
  await sbClient.close();

  context.res = { status: 202, body: { message: "Queued" } };
};
Environment variables: SERVICEBUS_CONN, QUEUE_NAME.
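A client can then queue a job with a plain HTTP call. The function app name and /api/enqueue route below are illustrative (they depend on how you name and route the function):

curl -X POST "https://<your-function-app>.azurewebsites.net/api/enqueue" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'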
3) Worker Function (containerized, Node.js + Puppeteer)
Use a Docker image that includes Chromium (for example, a Playwright/Chromium or headless-chrome base image). The function is triggered by Service Bus.
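The Service Bus trigger is declared next to the handler in function.json. A minimal sketch, assuming the binding name and queue used in the snippets above:

worker/function.json:
{
  "bindings": [
    {
      "name": "mySbMsg",
      "type": "serviceBusTrigger",
      "direction": "in",
      "queueName": "pdf-requests",
      "connection": "SERVICEBUS_CONN"
    }
  ]
}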
worker/index.js:
const puppeteer = require("puppeteer-core");
const { BlobServiceClient } = require("@azure/storage-blob");

module.exports = async function (context, mySbMsg) {
  const { html, url, options } = mySbMsg;

  // Launch headless Chromium inside the container environment
  const browser = await puppeteer.launch({
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
    executablePath: process.env.CHROME_PATH || '/usr/bin/chromium-browser'
  });

  try {
    const page = await browser.newPage();
    if (url) {
      await page.goto(url, { waitUntil: 'networkidle0', timeout: 60000 });
    } else {
      await page.setContent(html, { waitUntil: 'networkidle0' });
    }

    // Render the PDF; per-request options from the message override the defaults
    const pdfBuffer = await page.pdf({ format: 'A4', printBackground: true, ...options });

    // Upload to Blob Storage
    const blobServiceClient = BlobServiceClient.fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING);
    const containerClient = blobServiceClient.getContainerClient('pdfs');
    await containerClient.createIfNotExists();

    const fileName = `pdf-${Date.now()}.pdf`;
    const blockBlobClient = containerClient.getBlockBlobClient(fileName);
    await blockBlobClient.uploadData(pdfBuffer, { blobHTTPHeaders: { blobContentType: 'application/pdf' } });

    // Optionally, emit a completion event or write a record to a database here
    context.log(`Uploaded ${fileName}`);
  } finally {
    // Always release Chromium, even if rendering or upload fails
    await browser.close();
  }
};
Dockerfile (simplified):
FROM mcr.microsoft.com/azure-functions/node:4-node18

# Install Chromium for headless rendering
RUN apt-get update && apt-get install -y chromium && rm -rf /var/lib/apt/lists/*

COPY . /home/site/wwwroot
WORKDIR /home/site/wwwroot
RUN npm install

ENV CHROME_PATH=/usr/bin/chromium
Deploy the function as a custom container Azure Function. Custom containers are not supported on the Consumption plan, so use the Premium plan or a Dedicated (App Service) plan, sized for Chromium's memory needs.
4) Operational concerns
- Use Service Bus dead-lettering for poison messages.
- Set function concurrency to manage memory (Chromium is heavy); see the host.json sketch below.
- Use the Premium plan or an App Service plan for consistent cold starts and more memory.
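A minimal host.json sketch for capping worker concurrency. The exact property path depends on the Service Bus extension version in your bundle (shown here for the 2.x–4.x extension), so treat it as a starting point:

host.json:
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 2
      }
    }
  }
}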
Common pitfalls & answers
Q: Chromium crashes due to memory.
A: Limit concurrency, increase the plan SKU, or use a dedicated headless rendering microservice with autoscaling.
Q: How to preserve fonts/CSS?
A: Provide assets through a URL or embed styles inline; ensure the container has the required fonts installed.
Project 2 — GitOps CI/CD to AKS (Terraform + GitHub Actions + Helm)
Problem statement
Deliver a reproducible infrastructure + application deployment pipeline to AKS using Infrastructure-as-Code (Terraform), artifact builds, and GitOps-style deployment via Helm charts.
Architecture overview
GitHub (main branch) → GitHub Actions (build container image → push to ACR) → CI job runs Terraform (or Terraform Cloud) to provision AKS & ACR → Helm chart applied via kubectl or a Flux/GitOps operator.
Step-by-step implementation
1) Terraform: create resource group, ACR, AKS (minimal)
main.tf (focused):
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "rg" {
  name     = "gitops-rg"
  location = "westeurope"
}

resource "random_id" "suffix" {
  byte_length = 4
}

resource "azurerm_container_registry" "acr" {
  name                = "gitopsacr${random_id.suffix.hex}"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  sku                 = "Standard"
  admin_enabled       = false
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                = "gitops-aks"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  dns_prefix          = "gitopsaks"

  default_node_pool {
    name       = "nodepool"
    node_count = 2
    vm_size    = "Standard_DS2_v2"
  }

  identity {
    type = "SystemAssigned"
  }
}
Run terraform init → terraform apply.
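Terraform state is local by default; for team use, store it remotely (see the tips section below). A minimal backend sketch, assuming a pre-created storage account and container whose names here are placeholders:

# backend.tf (sketch)
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"
    storage_account_name = "tfstatestore12345"
    container_name       = "tfstate"
    key                  = "gitops-aks.terraform.tfstate"
  }
}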
2) Build & push CI (GitHub Actions)
.github/workflows/ci.yml:
name: CI

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to ACR
        uses: azure/docker-login@v1
        with:
          login-server: ${{ secrets.ACR_LOGIN_SERVER }}
          username: ${{ secrets.ACR_USERNAME }}
          password: ${{ secrets.ACR_PASSWORD }}

      - name: Build and push
        run: |
          IMAGE=${{ secrets.ACR_LOGIN_SERVER }}/myapp:${{ github.sha }}
          docker build -t $IMAGE .
          docker push $IMAGE

      - name: Create image tag file
        run: echo "${{ secrets.ACR_LOGIN_SERVER }}/myapp:${{ github.sha }}" > image.txt

      - name: Upload image info
        uses: actions/upload-artifact@v4
        with:
          name: image-info
          path: image.txt
3) CD job — apply Helm chart
.github/workflows/cd.yml (simplified):
name: CD

on:
  workflow_run:
    workflows: ["CI"]
    types:
      - completed

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Only deploy when the CI run succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    steps:
      - uses: actions/checkout@v4

      - name: Download image info
        uses: actions/download-artifact@v4
        with:
          name: image-info
          # The artifact belongs to the CI run, so point download-artifact at that run
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up kubectl
        uses: azure/setup-kubectl@v3

      - name: Azure login
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Get AKS credentials
        run: az aks get-credentials --resource-group gitops-rg --name gitops-aks

      - name: Deploy Helm
        run: |
          IMAGE=$(cat image.txt)
          helm upgrade --install myapp ./charts/myapp \
            --set image.repository=${IMAGE%:*} \
            --set image.tag=${IMAGE##*:}
4) Helm chart structure (values.yaml snippet)
replicaCount: 2

image:
  repository: myregistry.azurecr.io/myapp
  tag: latest

service:
  type: ClusterIP
  port: 8080
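For context, the chart's deployment template consumes these values roughly as follows. This is a sketch of an assumed charts/myapp/templates/deployment.yaml; your chart layout may differ:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          # image.repository and image.tag are what the CD job overrides via --set
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: {{ .Values.service.port }}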
5) Option: use Flux CD / ArgoCD for true GitOps
- Push Helm values or Kustomize manifests to a dedicated config Git repository (for example under a clusters/ directory); Flux watches the repo and reconciles the cluster to match it, as in the bootstrap sketch below.
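A typical starting point with Flux, assuming a personal GitHub repository as the config source (the owner, repository, and path values are placeholders):

flux bootstrap github \
  --owner=<github-user-or-org> \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/gitops-aks \
  --personal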
Common pitfalls & answers
Q: How to manage secrets?
A: Use Azure Key Vault + secrets-store-csi-driver for Kubernetes, or GitHub Secrets for CI. Avoid committing sensitive data.
Q: How to handle database migrations across deployments?
A: Use pre-deploy jobs or a Kubernetes Job to perform migrations; ensure idempotency.
Project 3 — Real-time telemetry pipeline (Event Hubs → Stream Analytics → Cosmos DB + Power BI)
Problem statement
Ingest high-volume telemetry from IoT devices or microservices, transform/aggregate in near real time, and store results for query and dashboarding.
Architecture overview
Producers → Azure Event Hubs → Stream Analytics job (SQL transformations) → Cosmos DB (or Azure Data Explorer) → Power BI for visualization.
Step-by-step implementation
1) Create Event Hub namespace & hub (az cli)
az eventhubs namespace create --name telemetry-ns --resource-group telemetry-rg --location westeurope --sku Standard
az eventhubs eventhub create --name telemetry-hub --namespace-name telemetry-ns --resource-group telemetry-rg --partition-count 4
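To fetch the connection string the producer below expects (the namespace's default RootManageSharedAccessKey authorization rule is assumed):

az eventhubs namespace authorization-rule keys list \
  --resource-group telemetry-rg \
  --namespace-name telemetry-ns \
  --name RootManageSharedAccessKey \
  --query primaryConnectionString -o tsv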
2) Producer sample (Python) — send JSON telemetry
producer.py:
from azure.eventhub import EventHubProducerClient, EventData
import json, time

producer = EventHubProducerClient.from_connection_string(
    conn_str="<<EVENTHUB_CONN>>", eventhub_name="telemetry-hub"
)

with producer:
    batch = producer.create_batch()
    for i in range(10):
        payload = {"deviceId": "dev-01", "ts": int(time.time()), "temp": 25 + i}
        batch.add(EventData(json.dumps(payload)))
    producer.send_batch(batch)
3) Stream Analytics job (simple aggregation)
In the Stream Analytics job, set the input to the Event Hub and the output to Cosmos DB. Query to compute 1-minute averages per device:
SELECT
System.Timestamp as windowEnd,
deviceId,
AVG(temp) AS avgTemp,
COUNT(*) AS cnt
INTO
[CosmosOutput]
FROM
[EventHubInput]
GROUP BY
TumblingWindow(minute, 1), deviceId
4) Cosmos DB output (JSON documents). Power BI then connects via the Cosmos DB connector, or you can export to ADLS for larger analytics.
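Each aggregated window lands in Cosmos DB as a document shaped roughly like this (illustrative values; field names follow the query aliases):

{
  "windowEnd": "2025-01-01T12:01:00.0000000Z",
  "deviceId": "dev-01",
  "avgTemp": 27.4,
  "cnt": 12
}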
5) Scaling & performance
- Partitioning: have the producer set a partition key (e.g., deviceId) to distribute load while keeping per-device ordering; see the snippet below.
- Capacity scales with Event Hubs throughput units (Standard tier) and the partition count chosen at creation.
- Stream Analytics scales with the number of streaming units (SUs) assigned to the job.
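A minimal change to the producer from step 2, assuming the v5 azure-eventhub SDK, which accepts a partition key when creating a batch:

# Pin each batch to a partition key so all events for a device
# land on the same partition (preserves per-device ordering).
with producer:
    batch = producer.create_batch(partition_key="dev-01")
    for i in range(10):
        payload = {"deviceId": "dev-01", "ts": int(time.time()), "temp": 25 + i}
        batch.add(EventData(json.dumps(payload)))
    producer.send_batch(batch)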
Common pitfalls & answers
Q: Message ordering lost?
A: Ordering is guaranteed only within a partition; pick the partition key carefully.
Q: Do high-cardinality writes make Cosmos DB expensive?
A: Aggregate or compress in Stream Analytics, or write to Azure Data Explorer (ADX) for analytics workloads.
Related tools & libraries
- Azure SDKs: @azure/service-bus, @azure/storage-blob, azure-eventhub, @azure/cosmos
- Containerized browsers: puppeteer-core, playwright
- IaC: Terraform, Bicep
- CI/CD: GitHub Actions, Azure DevOps Pipelines
- Kubernetes tools: Helm, FluxCD, ArgoCD, kubectl
- Monitoring & logging: Azure Monitor, Application Insights, ELK stack
- Data: Azure Stream Analytics, Azure Data Explorer, Cosmos DB
Developer tips & best practices
- Prefer idempotent infra: Terraform state should be centrally stored (backend in storage account).
- Secrets management: Use Azure Key Vault + managed identities. Avoid plain secrets in YAML.
- Observability first: Add structured logs (JSON) to apps and ensure trace IDs propagate.
- Local iteration: Use minikube/kind + skaffold or draft for local app iteration before deploying to a cluster.
- Cost guardrails: Use budgets and Azure Policy to prevent oversized SKUs in non-prod.
- Testing pipelines: Add integration tests as a gate in CI; use ephemeral namespaces for end-to-end runs.
Common developer questions (short answers)
Q: Should I use Azure Functions or AKS for background workers?
A: For short, bursty, event-driven jobs with simple dependencies: Functions. For heavy workloads (GPU, Chromium) or jobs with long-running intermediate state: AKS or a dedicated container instance.
Q: How do I handle retries for Service Bus messages?
A: Configure maxDeliveryCount and dead-letter queue. Use poison message handling and idempotency in the worker.
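In Terraform, the queue's retry behavior can be tuned on the azurerm_servicebus_queue resource from Project 1; a sketch using documented provider attributes:

resource "azurerm_servicebus_queue" "queue" {
  name                = "pdf-requests"
  namespace_name      = azurerm_servicebus_namespace.sb.name
  resource_group_name = azurerm_resource_group.rg.name

  # After 10 failed deliveries the message is moved to the dead-letter queue
  max_delivery_count                   = 10
  # Expired messages also land in the dead-letter queue
  dead_lettering_on_message_expiration = true
}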
Q: Do I need Stream Analytics or can I use Functions for streaming transforms?
A: Stream Analytics is simpler for SQL-style windowed aggregations at scale. Functions allow custom logic but need more operational code.
Q: How do I test infra safely?
A: Use ephemeral resource groups, tag resources, and automated teardown. Use Terraform workspaces and CI runners scoped to test subscriptions.
Conclusion
These three projects show the core Azure building blocks you’ll use in production: async processing, GitOps CI/CD, and streaming telemetry. Implementing them gives you reusable patterns you can apply across domains.
CareerByteCode’s collection of 1900+ real-time projects (practical, end-to-end artifacts) accelerates that transition from zero experience to leader by giving you reproducible repo + infra + pipeline blueprints, so you spend time building and learning rather than designing boilerplate.
Follow me for more dev tutorials — practical code, production patterns, and end-to-end walkthroughs.