Table of contents
- Why these projects and why CareerByteCode matters
- Project 1 — Serverless HTML → PDF pipeline (Azure Service Bus + Functions + Headless Chrome)
  - Problem statement
  - Architecture overview
  - Step-by-step implementation (code included)
  - Common pitfalls & answers
- Project 2 — GitOps CI/CD to AKS (Terraform + GitHub Actions + Helm)
  - Problem statement
  - Architecture overview
  - Step-by-step implementation (code included)
  - Common pitfalls & answers
- Project 3 — Real-time telemetry pipeline (Event Hubs → Stream Analytics → Cosmos DB + Power BI)
  - Problem statement
  - Architecture overview
  - Step-by-step implementation (code included)
  - Common pitfalls & answers
- Related tools & libraries
- Developer tips & best practices
- Common developer questions (short answers)
- Conclusion
Why these projects and why CareerByteCode matters
Developers succeed when they solve realistic constraints, not toy problems. The three projects below are deliberately chosen because they appear in real production systems:
- Asynchronous processing (PDF generation, email batching, long-running work).
- CI/CD & GitOps (reliable deployment across clusters).
- Real-time telemetry (observability, scaling, and streaming transforms).
CareerByteCode's catalog of 1900+ real-time projects provides curated, executable projects with full infra + app code + deployment guides. That density helps you:
- Reuse patterns → accelerate architecture learning.
- Practice full-stack flows end-to-end (infra, app, pipeline, monitoring).
- Move from concept to production-ready artifacts — the difference between "knowing" and "doing."
Project 1 — Serverless HTML → PDF pipeline (Azure Service Bus + Functions + Headless Chrome)
Problem statement
Generate PDFs from incoming HTML pages at scale with reliability, retry semantics, and no dedicated VM fleet. Requirements:
- Accept HTML or URL payloads.
- Queue requests, process with worker Functions to avoid timeouts.
- Support headless Chrome rendering and CSS/media queries.
- Store PDFs in Azure Blob Storage and emit completion events.
Architecture overview
Client → HTTP Function (enqueue) → Azure Service Bus Queue → Consumption Function (containerized Headless Chrome) → Blob Storage → (optional) Notification Topic/Event Grid.
Step-by-step implementation
1) Provision infra (minimal az cli + Terraform snippet)
Terraform (snippet — create resource group, storage account, service bus namespace/queue):
# main.tf (snippet)
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "rg" {
  name     = "pdf-rg"
  location = "westeurope"
}

resource "azurerm_storage_account" "sa" {
  name                     = "pdfstore${random_id.suffix.hex}"
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "random_id" "suffix" {
  byte_length = 4
}

resource "azurerm_servicebus_namespace" "sb" {
  name                = "pdf-sb-namespace"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  sku                 = "Standard"
}

resource "azurerm_servicebus_queue" "queue" {
  name                = "pdf-requests"
  namespace_name      = azurerm_servicebus_namespace.sb.name
  resource_group_name = azurerm_resource_group.rg.name
}
2) HTTP enqueue function (Node.js / TypeScript)
enqueue/index.js:
const { ServiceBusClient } = require("@azure/service-bus");

const connectionString = process.env.SERVICEBUS_CONN;
const queueName = process.env.QUEUE_NAME;

module.exports = async function (context, req) {
  const payload = req.body;
  if (!payload || (!payload.html && !payload.url)) {
    context.res = { status: 400, body: "Provide html or url" };
    return;
  }

  // @azure/service-bus v7 API: create a client and a sender for the queue
  const sbClient = new ServiceBusClient(connectionString);
  const sender = sbClient.createSender(queueName);

  const message = {
    body: payload,
    contentType: "application/json",
    subject: "pdfRequest",
  };

  await sender.sendMessages(message);
  await sender.close();
  await sbClient.close();

  context.res = { status: 202, body: { message: "Queued" } };
};
Environment variables: SERVICEBUS_CONN, QUEUE_NAME.
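A client can then queue a job with a plain HTTP call. The function app name and /api/enqueue route below are illustrative (they depend on how you name and route the function):

curl -X POST "https://<your-function-app>.azurewebsites.net/api/enqueue" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'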
3) Worker Function (containerized, Node.js + Puppeteer)
Use a Docker image that includes Chromium (for example, a Playwright/Chromium or headless-chrome base image). The function is triggered by Service Bus.
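The Service Bus trigger is declared next to the handler in function.json. A minimal sketch, assuming the binding name and queue used in the snippets above:

worker/function.json:
{
  "bindings": [
    {
      "name": "mySbMsg",
      "type": "serviceBusTrigger",
      "direction": "in",
      "queueName": "pdf-requests",
      "connection": "SERVICEBUS_CONN"
    }
  ]
}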
worker/index.js:
const puppeteer = require("puppeteer-core");
const { BlobServiceClient } = require("@azure/storage-blob");

module.exports = async function (context, mySbMsg) {
  const { html, url, options } = mySbMsg;

  // Launch headless Chromium inside the container environment
  const browser = await puppeteer.launch({
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
    executablePath: process.env.CHROME_PATH || '/usr/bin/chromium-browser'
  });

  try {
    const page = await browser.newPage();
    if (url) {
      await page.goto(url, { waitUntil: 'networkidle0', timeout: 60000 });
    } else {
      await page.setContent(html, { waitUntil: 'networkidle0' });
    }

    // Render the PDF; per-request options from the message override the defaults
    const pdfBuffer = await page.pdf({ format: 'A4', printBackground: true, ...options });

    // Upload to Blob Storage
    const blobServiceClient = BlobServiceClient.fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING);
    const containerClient = blobServiceClient.getContainerClient('pdfs');
    await containerClient.createIfNotExists();

    const fileName = `pdf-${Date.now()}.pdf`;
    const blockBlobClient = containerClient.getBlockBlobClient(fileName);
    await blockBlobClient.uploadData(pdfBuffer, { blobHTTPHeaders: { blobContentType: 'application/pdf' } });

    // Optionally, emit a completion event or write a record to a database here
    context.log(`Uploaded ${fileName}`);
  } finally {
    // Always release Chromium, even if rendering or upload fails
    await browser.close();
  }
};
Dockerfile (simplified):
FROM mcr.microsoft.com/azure-functions/node:4-node18

# Install Chromium for headless rendering
RUN apt-get update && apt-get install -y chromium && rm -rf /var/lib/apt/lists/*

COPY . /home/site/wwwroot
WORKDIR /home/site/wwwroot
RUN npm install

ENV CHROME_PATH=/usr/bin/chromium
Deploy the function as a custom container Azure Function. Custom containers are not supported on the Consumption plan, so use the Premium plan or a Dedicated (App Service) plan, sized for Chromium's memory needs.
4) Operational concerns
- Use Service Bus dead-lettering for poison messages.
- Set function concurrency to manage memory (Chromium is heavy); see the host.json sketch below.
- Use the Premium plan or an App Service plan for consistent cold starts and more memory.
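A minimal host.json sketch for capping worker concurrency. The exact property path depends on the Service Bus extension version in your bundle (shown here for the 2.x–4.x extension), so treat it as a starting point:

host.json:
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 2
      }
    }
  }
}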
Common pitfalls & answers
Q: Chromium crashes due to memory.
A: Limit concurrency, increase the plan SKU, or use a dedicated headless rendering microservice with autoscaling.
Q: How to preserve fonts/CSS?
A: Provide assets through a URL or embed styles inline; ensure the container has the required fonts installed.
Project 2 — GitOps CI/CD to AKS (Terraform + GitHub Actions + Helm)
Problem statement
Deliver a reproducible infrastructure + application deployment pipeline to AKS using Infrastructure-as-Code (Terraform), artifact builds, and GitOps-style deployment via Helm charts.
Architecture overview
GitHub (main branch) → GitHub Actions (build container image → push to ACR) → CI job runs Terraform (or Terraform Cloud) to provision AKS & ACR → Helm chart applied via kubectl or a Flux/GitOps operator.
Step-by-step implementation
1) Terraform: create resource group, ACR, AKS (minimal)
main.tf (focused):
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "rg" {
  name     = "gitops-rg"
  location = "westeurope"
}

resource "random_id" "suffix" {
  byte_length = 4
}

resource "azurerm_container_registry" "acr" {
  name                = "gitopsacr${random_id.suffix.hex}"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  sku                 = "Standard"
  admin_enabled       = false
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                = "gitops-aks"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  dns_prefix          = "gitopsaks"

  default_node_pool {
    name       = "nodepool"
    node_count = 2
    vm_size    = "Standard_DS2_v2"
  }

  identity {
    type = "SystemAssigned"
  }
}
Run terraform init → terraform apply.
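Terraform state is local by default; for team use, store it remotely (see the tips section below). A minimal backend sketch, assuming a pre-created storage account and container whose names here are placeholders:

# backend.tf (sketch)
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"
    storage_account_name = "tfstatestore12345"
    container_name       = "tfstate"
    key                  = "gitops-aks.terraform.tfstate"
  }
}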
2) Build & push CI (GitHub Actions)
.github/workflows/ci.yml:
name: CI

on:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to ACR
        uses: azure/docker-login@v1
        with:
          login-server: ${{ secrets.ACR_LOGIN_SERVER }}
          username: ${{ secrets.ACR_USERNAME }}
          password: ${{ secrets.ACR_PASSWORD }}

      - name: Build and push
        run: |
          IMAGE=${{ secrets.ACR_LOGIN_SERVER }}/myapp:${{ github.sha }}
          docker build -t $IMAGE .
          docker push $IMAGE

      - name: Create image tag file
        run: echo "${{ secrets.ACR_LOGIN_SERVER }}/myapp:${{ github.sha }}" > image.txt

      - name: Upload image info
        uses: actions/upload-artifact@v4
        with:
          name: image-info
          path: image.txt
3) CD job — apply Helm chart
.github/workflows/cd.yml (simplified):
name: CD

on:
  workflow_run:
    workflows: ["CI"]
    types:
      - completed

jobs:
  deploy:
    runs-on: ubuntu-latest
    # Only deploy when the CI run succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    steps:
      - uses: actions/checkout@v4

      - name: Download image info
        uses: actions/download-artifact@v4
        with:
          name: image-info
          # The artifact belongs to the CI run, so point download-artifact at that run
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up kubectl
        uses: azure/setup-kubectl@v3

      - name: Azure login
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Get AKS credentials
        run: az aks get-credentials --resource-group gitops-rg --name gitops-aks

      - name: Deploy Helm
        run: |
          IMAGE=$(cat image.txt)
          helm upgrade --install myapp ./charts/myapp \
            --set image.repository=${IMAGE%:*} \
            --set image.tag=${IMAGE##*:}
4) Helm chart structure (values.yaml snippet)
replicaCount: 2

image:
  repository: myregistry.azurecr.io/myapp
  tag: latest

service:
  type: ClusterIP
  port: 8080
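For context, the chart's deployment template consumes these values roughly as follows. This is a sketch of an assumed charts/myapp/templates/deployment.yaml; your chart layout may differ:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          # image.repository and image.tag are what the CD job overrides via --set
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: {{ .Values.service.port }}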
5) Option: use Flux CD / ArgoCD for true GitOps
- Push Helm values or Kustomize manifests to a dedicated config Git repository (for example under a clusters/ directory); Flux watches the repo and reconciles the cluster to match it, as in the bootstrap sketch below.
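A typical starting point with Flux, assuming a personal GitHub repository as the config source (the owner, repository, and path values are placeholders):

flux bootstrap github \
  --owner=<github-user-or-org> \
  --repository=fleet-infra \
  --branch=main \
  --path=clusters/gitops-aks \
  --personal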
Common pitfalls & answers
Q: How to manage secrets?
A: Use Azure Key Vault + secrets-store-csi-driver for Kubernetes, or GitHub Secrets for CI. Avoid committing sensitive data.
Q: How to handle database migrations across deployments?
A: Use pre-deploy jobs or a Kubernetes Job to perform migrations; ensure idempotency.
Project 3 — Real-time telemetry pipeline (Event Hubs → Stream Analytics → Cosmos DB + Power BI)
Problem statement
Ingest high-volume telemetry from IoT devices or microservices, transform/aggregate in near real time, and store results for query and dashboarding.
Architecture overview
Producers → Azure Event Hubs → Stream Analytics job (SQL transformations) → Cosmos DB (or Azure Data Explorer) → Power BI for visualization.
Step-by-step implementation
1) Create Event Hub namespace & hub (az cli)
az eventhubs namespace create --name telemetry-ns --resource-group telemetry-rg --location westeurope --sku Standard
az eventhubs eventhub create --name telemetry-hub --namespace-name telemetry-ns --resource-group telemetry-rg --partition-count 4
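To fetch the connection string the producer below expects (the namespace's default RootManageSharedAccessKey authorization rule is assumed):

az eventhubs namespace authorization-rule keys list \
  --resource-group telemetry-rg \
  --namespace-name telemetry-ns \
  --name RootManageSharedAccessKey \
  --query primaryConnectionString -o tsv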
2) Producer sample (Python) — send JSON telemetry
producer.py:
from azure.eventhub import EventHubProducerClient, EventData
import json, time

producer = EventHubProducerClient.from_connection_string(
    conn_str="<<EVENTHUB_CONN>>", eventhub_name="telemetry-hub"
)

with producer:
    batch = producer.create_batch()
    for i in range(10):
        payload = {"deviceId": "dev-01", "ts": int(time.time()), "temp": 25 + i}
        batch.add(EventData(json.dumps(payload)))
    producer.send_batch(batch)
3) Stream Analytics job (simple aggregation)
In the Stream Analytics job, set the input to the Event Hub and the output to Cosmos DB. Query to compute 1-minute averages per device:
SELECT
System.Timestamp as windowEnd,
deviceId,
AVG(temp) AS avgTemp,
COUNT(*) AS cnt
INTO
[CosmosOutput]
FROM
[EventHubInput]
GROUP BY
TumblingWindow(minute, 1), deviceId
4) Cosmos DB output (JSON documents). Power BI then connects via the Cosmos DB connector, or you can export to ADLS for larger analytics.
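Each aggregated window lands in Cosmos DB as a document shaped roughly like this (illustrative values; field names follow the query aliases):

{
  "windowEnd": "2025-01-01T12:01:00.0000000Z",
  "deviceId": "dev-01",
  "avgTemp": 27.4,
  "cnt": 12
}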
5) Scaling & performance
- Partitioning: have the producer set a partition key (e.g., deviceId) to distribute load while keeping per-device ordering; see the snippet below.
- Capacity scales with Event Hubs throughput units (Standard tier) and the partition count chosen at creation.
- Stream Analytics scales with the number of streaming units (SUs) assigned to the job.
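A minimal change to the producer from step 2, assuming the v5 azure-eventhub SDK, which accepts a partition key when creating a batch:

# Pin each batch to a partition key so all events for a device
# land on the same partition (preserves per-device ordering).
with producer:
    batch = producer.create_batch(partition_key="dev-01")
    for i in range(10):
        payload = {"deviceId": "dev-01", "ts": int(time.time()), "temp": 25 + i}
        batch.add(EventData(json.dumps(payload)))
    producer.send_batch(batch)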
Common pitfalls & answers
Q: Message ordering lost?
A: Ordering is guaranteed only within a partition; pick the partition key carefully.
Q: Do high-cardinality writes make Cosmos DB expensive?
A: Aggregate or compress in Stream Analytics, or write to Azure Data Explorer (ADX) for analytics workloads.
Related tools & libraries
- Azure SDKs: @azure/service-bus, @azure/storage-blob, azure-eventhub, @azure/cosmos
- Containerized browsers: puppeteer-core, playwright
- IaC: Terraform, Bicep
- CI/CD: GitHub Actions, Azure DevOps Pipelines
- Kubernetes tools: Helm, FluxCD, ArgoCD, kubectl
- Monitoring & logging: Azure Monitor, Application Insights, ELK stack
- Data: Azure Stream Analytics, Azure Data Explorer, Cosmos DB
Developer tips & best practices
- Prefer idempotent infra: Terraform state should be centrally stored (backend in storage account).
- Secrets management: Use Azure Key Vault + managed identities. Avoid plain secrets in YAML.
- Observability first: Add structured logs (JSON) to apps and ensure trace IDs propagate.
- Local iteration: Use minikube/kind + skaffold or draft for local app iteration before deploying to a cluster.
- Cost guardrails: Use budgets and Azure Policy to prevent oversized SKUs in non-prod.
- Testing pipelines: Add integration tests as a gate in CI; use ephemeral namespaces for end-to-end runs.
Common developer questions (short answers)
Q: Should I use Azure Functions or AKS for background workers?
A: For short, bursty, event-driven jobs with simple dependencies: Functions. For heavy workloads (GPU, Chromium) or jobs with long-running intermediate state: AKS or a dedicated container instance.
Q: How do I handle retries for Service Bus messages?
A: Configure maxDeliveryCount and dead-letter queue. Use poison message handling and idempotency in the worker.
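In Terraform, the queue's retry behavior can be tuned on the azurerm_servicebus_queue resource from Project 1; a sketch using documented provider attributes:

resource "azurerm_servicebus_queue" "queue" {
  name                = "pdf-requests"
  namespace_name      = azurerm_servicebus_namespace.sb.name
  resource_group_name = azurerm_resource_group.rg.name

  # After 10 failed deliveries the message is moved to the dead-letter queue
  max_delivery_count                   = 10
  # Expired messages also land in the dead-letter queue
  dead_lettering_on_message_expiration = true
}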
Q: Do I need Stream Analytics or can I use Functions for streaming transforms?
A: Stream Analytics is simpler for SQL-style windowed aggregations at scale. Functions allow custom logic but need more operational code.
Q: How do I test infra safely?
A: Use ephemeral resource groups, tag resources, and automated teardown. Use Terraform workspaces and CI runners scoped to test subscriptions.
Conclusion
These three projects show the core Azure building blocks you’ll use in production: async processing, GitOps CI/CD, and streaming telemetry. Implementing them gives you reusable patterns you can apply across domains.
CareerByteCode’s collection of 1900+ real-time projects (practical, end-to-end artifacts) accelerates that transition from zero experience to leader by giving you reproducible repo + infra + pipeline blueprints, so you spend time building and learning rather than designing boilerplate.
Follow me for more dev tutorials — practical code, production patterns, and end-to-end walkthroughs.