Build 3 Real-World Azure Projects - Problem Statements, Step-by-Step Solutions, and Code (Developer Focused)

  • MyrinNew
    Senior Member
    • Feb 2024
    • 5168


    Table of contents

    1. Why these projects and why CareerByteCode matters
    2. Project 1 — Serverless HTML → PDF pipeline (Azure Service Bus + Functions + Headless Chrome)
      • Problem statement
      • Architecture overview
      • Step-by-step implementation (code included)
      • Common pitfalls & answers
    3. Project 2 — GitOps CI/CD to AKS (Terraform + GitHub Actions + Helm)
      • Problem statement
      • Architecture overview
      • Step-by-step implementation (code included)
      • Common pitfalls & answers
    4. Project 3 — Real-time telemetry pipeline (Event Hubs → Stream Analytics → Cosmos DB + Power BI)
      • Problem statement
      • Architecture overview
      • Step-by-step implementation (code included)
      • Common pitfalls & answers
    5. Related tools & libraries
    6. Developer tips & best practices
    7. Common developer questions (short answers)
    8. Conclusion


    Why these projects and why CareerByteCode matters

    Developers succeed when they solve realistic constraints, not toy problems. The three projects below are deliberately chosen because they appear in real production systems:
    • Asynchronous processing (PDF generation, email batching, long-running work).
    • CI/CD & GitOps (reliable deployment across clusters).
    • Real-time telemetry (observability, scaling, and streaming transforms).


    CareerByteCode's catalog of 1900+ real-time projects provides curated, executable projects with full infra + app code + deployment guides. That density helps you:
    • Reuse patterns → accelerate architecture learning.
    • Practice full-stack flows end-to-end (infra, app, pipeline, monitoring).
    • Move from concept to production-ready artifacts — the difference between "knowing" and "doing."





    Project 1 — Serverless HTML → PDF pipeline (Azure Service Bus + Functions + Headless Chrome)

    Problem statement

    Generate PDFs from incoming HTML pages at scale with reliability, retry semantics, and no dedicated VM fleet. Requirements:
    • Accept HTML or URL payloads.
    • Queue requests, process with worker Functions to avoid timeouts.
    • Support headless Chrome rendering and CSS/media queries.
    • Store PDFs in Azure Blob Storage and emit completion events.
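
The first requirement (accept HTML or URL payloads) can be sketched as a validation step that runs before anything is queued. This is a minimal Python sketch, not code from the project itself: the helper name `validate_pdf_request` and the rule that only absolute http(s) URLs are accepted are assumptions made for illustration.

```python
from urllib.parse import urlparse


def validate_pdf_request(payload):
    """Return (ok, error_message) for an incoming PDF request payload.

    Hypothetical helper: expects a dict carrying either an "html" string
    or a "url" string, mirroring the requirement above.
    """
    if not isinstance(payload, dict):
        return False, "Payload must be a JSON object"

    html = payload.get("html")
    url = payload.get("url")

    if not html and not url:
        return False, "Provide html or url"

    if url:
        if not isinstance(url, str):
            return False, "url must be a string"
        parsed = urlparse(url)
        # Assumption: only absolute http(s) URLs are allowed.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            return False, "url must be an absolute http(s) URL"

    return True, None
```

Rejecting bad input at the HTTP edge keeps malformed requests out of the queue, so the worker only spends Chromium time on payloads it can render.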


    Architecture overview

    Client → HTTP Function (enqueue) → Azure Service Bus Queue → Service Bus-triggered worker Function (containerized headless Chrome, Premium or Dedicated plan) → Blob Storage → (optional) Notification Topic/Event Grid.


    Step-by-step implementation

    1) Provision infra (minimal Terraform snippet)

    Terraform snippet that creates the resource group, storage account, and Service Bus namespace and queue:






    # main.tf (snippet)
    # Assumes azurerm 3.x; the version pin is illustrative
    terraform {
      required_providers {
        azurerm = { source = "hashicorp/azurerm", version = "~> 3.0" }
        random  = { source = "hashicorp/random" }
      }
    }

    provider "azurerm" {
      features {}
    }

    resource "azurerm_resource_group" "rg" {
      name     = "pdf-rg"
      location = "westeurope"
    }

    # Storage account and Service Bus namespace names must be globally unique
    resource "random_id" "suffix" {
      byte_length = 4
    }

    resource "azurerm_storage_account" "sa" {
      name                     = "pdfstore${random_id.suffix.hex}"
      resource_group_name      = azurerm_resource_group.rg.name
      location                 = azurerm_resource_group.rg.location
      account_tier             = "Standard"
      account_replication_type = "LRS"
    }

    resource "azurerm_servicebus_namespace" "sb" {
      name                = "pdf-sb-${random_id.suffix.hex}"
      location            = azurerm_resource_group.rg.location
      resource_group_name = azurerm_resource_group.rg.name
      sku                 = "Standard"
    }

    resource "azurerm_servicebus_queue" "queue" {
      name = "pdf-requests"
      # azurerm 3.x takes namespace_id; 2.x used namespace_name + resource_group_name
      namespace_id = azurerm_servicebus_namespace.sb.id
    }







    2) HTTP enqueue function (Node.js)

    enqueue/index.js:






    const { ServiceBusClient } = require("@azure/service-bus");

    const connectionString = process.env.SERVICEBUS_CONN;
    const queueName = process.env.QUEUE_NAME;

    module.exports = async function (context, req) {
      const payload = req.body;
      if (!payload || (!payload.html && !payload.url)) {
        context.res = { status: 400, body: "Provide html or url" };
        return;
      }

      // @azure/service-bus v7 API: construct the client directly, then create a sender for the queue
      const sbClient = new ServiceBusClient(connectionString);
      const sender = sbClient.createSender(queueName);
      try {
        await sender.sendMessages({
          body: payload,
          contentType: "application/json",
          subject: "pdfRequest", // named "label" in SDK versions before v7
        });
      } finally {
        // Release the connection even if sending fails
        await sender.close();
        await sbClient.close();
      }

      context.res = { status: 202, body: { message: "Queued" } };
    };







    Environment variables: SERVICEBUS_CONN, QUEUE_NAME.


    3) Worker Function (containerized, Node.js + Puppeteer)

    Use a Docker image that includes Chromium (for example a Playwright or headless-Chrome base image, or install Chromium yourself as in the Dockerfile below). The Function is triggered by the Service Bus queue.


    worker/index.js:






    const puppeteer = require("puppeteer-core");
    const { BlobServiceClient } = require("@azure/storage-blob");

    module.exports = async function (context, mySbMsg) {
      // The Service Bus trigger usually deserializes JSON bodies; parse if a string arrives
      const { html, url } = typeof mySbMsg === "string" ? JSON.parse(mySbMsg) : mySbMsg;

      // Launch headless Chrome in the container environment
      const browser = await puppeteer.launch({
        args: ["--no-sandbox", "--disable-setuid-sandbox"],
        executablePath: process.env.CHROME_PATH || "/usr/bin/chromium",
      });

      let pdfBuffer;
      try {
        const page = await browser.newPage();
        if (url) {
          await page.goto(url, { waitUntil: "networkidle0", timeout: 60000 });
        } else {
          await page.setContent(html, { waitUntil: "networkidle0" });
        }
        pdfBuffer = await page.pdf({ format: "A4", printBackground: true });
      } finally {
        // Always close Chromium, even when rendering throws, so retries do not leak processes
        await browser.close();
      }

      // Upload to Blob Storage
      const blobServiceClient = BlobServiceClient.fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING);
      const containerClient = blobServiceClient.getContainerClient("pdfs");
      await containerClient.createIfNotExists();
      // The invocation ID avoids name collisions when several workers finish in the same millisecond
      const fileName = `pdf-${Date.now()}-${context.invocationId}.pdf`;
      const blockBlobClient = containerClient.getBlockBlobClient(fileName);
      await blockBlobClient.uploadData(pdfBuffer, { blobHTTPHeaders: { blobContentType: "application/pdf" } });

      // Optionally, emit a completion event or write to a DB
      context.log(`Uploaded ${fileName}`);
    };







    Dockerfile (simplified):






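    FROM mcr.microsoft.com/azure-functions/node:4-node18
    ENV AzureWebJobsScriptRoot=/home/site/wwwroot \
        CHROME_PATH=/usr/bin/chromium
    # Install Chromium plus a basic font set for rendering
    RUN apt-get update && apt-get install -y --no-install-recommends chromium fonts-liberation \
        && rm -rf /var/lib/apt/lists/*
    COPY . /home/site/wwwroot
    # Install dependencies inside the function app directory
    RUN cd /home/site/wwwroot && npm install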







    Deploy the worker as a custom-container Azure Function on a Premium or Dedicated (App Service) plan. Custom containers are not supported on the Consumption plan, and Chromium benefits from the extra memory.


    4) Operational concerns

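    • Use Service Bus dead-lettering for poison messages: once a message has failed maxDeliveryCount deliveries, it moves to the dead-letter queue instead of being retried forever.
    • Limit function concurrency to manage memory (Chromium is heavy), for example by lowering the Service Bus trigger's maxConcurrentCalls setting in host.json.
    • Use a Premium or App Service plan to avoid cold starts and get more memory.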

    Common pitfalls & answers

    • Q: Chromium crashes due to memory.

      A: Limit concurrency, increase plan SKU, or use a dedicated headless rendering microservice with autoscaling.
    • Q: How to preserve fonts/CSS?

      A: Provide assets through a URL or embed styles inline; ensure container has required fonts.
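
Limiting concurrency, the first remedy for memory crashes above, can be sketched with a fixed-size worker pool. This is an illustrative Python sketch with a fake render function rather than real Chromium; `MAX_CONCURRENT_RENDERS`, `fake_render`, and `render_all` are hypothetical names, and the limit of 2 is an arbitrary assumption.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_RENDERS = 2  # assumption: tuned to the memory available per instance

_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_render(job_id):
    """Stand-in for a Chromium render; records how many run at once."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # pretend to do work
    with _lock:
        _active -= 1
    return f"pdf-{job_id}"


def render_all(job_ids):
    # The pool size caps how many renders are in flight at any moment.
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_RENDERS) as pool:
        return list(pool.map(fake_render, job_ids))


results = render_all(range(6))
```

All six jobs finish, but `peak_active` never exceeds the pool size, which is the property that keeps memory use bounded.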





    Project 2 — GitOps CI/CD to AKS (Terraform + GitHub Actions + Helm)

    Problem statement

    Deliver a reproducible infrastructure + application deployment pipeline to AKS using Infrastructure-as-Code (Terraform), artifact builds, and GitOps-style deployment via Helm charts.


    Architecture overview

    Terraform (run locally, in CI, or through Terraform Cloud) provisions AKS and ACR → GitHub (main branch) → GitHub Actions builds the container image and pushes it to ACR → Helm chart applied with helm upgrade, or reconciled by a Flux/Argo CD GitOps operator.


    Step-by-step implementation

    1) Terraform: create resource group, ACR, AKS (minimal)

    main.tf (focused):






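    provider "azurerm" {
      features {}
    }

    resource "azurerm_resource_group" "rg" {
      name     = "gitops-rg"
      location = "westeurope"
    }

    # ACR names must be globally unique, so append a random suffix
    resource "random_id" "suffix" {
      byte_length = 4
    }

    resource "azurerm_container_registry" "acr" {
      name                = "gitopsacr${random_id.suffix.hex}"
      resource_group_name = azurerm_resource_group.rg.name
      location            = azurerm_resource_group.rg.location
      sku                 = "Standard"
      admin_enabled       = false
    }

    resource "azurerm_kubernetes_cluster" "aks" {
      name                = "gitops-aks"
      location            = azurerm_resource_group.rg.location
      resource_group_name = azurerm_resource_group.rg.name
      dns_prefix          = "gitopsaks"

      default_node_pool {
        name       = "nodepool"
        node_count = 2
        vm_size    = "Standard_DS2_v2"
      }

      identity {
        type = "SystemAssigned"
      }
    }

    # Let the cluster's kubelet identity pull images from ACR
    resource "azurerm_role_assignment" "acr_pull" {
      scope                = azurerm_container_registry.acr.id
      role_definition_name = "AcrPull"
      principal_id         = azurerm_kubernetes_cluster.aks.kubelet_identity[0].object_id
    }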







    Run terraform init → terraform apply.


    2) Build & push CI (GitHub Actions)

    .github/workflows/ci.yml:






    name: CI

    on:
      push:
        branches: [ main ]

    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Set up Docker Buildx
            uses: docker/setup-buildx-action@v3
          - name: Login to ACR
            uses: azure/docker-login@v1
            with:
              login-server: ${{ secrets.ACR_LOGIN_SERVER }}
              username: ${{ secrets.ACR_USERNAME }}
              password: ${{ secrets.ACR_PASSWORD }}
          - name: Build and push
            run: |
              IMAGE=${{ secrets.ACR_LOGIN_SERVER }}/myapp:${{ github.sha }}
              docker build -t $IMAGE .
              docker push $IMAGE
          - name: Create image tag file
            run: echo "${{ secrets.ACR_LOGIN_SERVER }}/myapp:${{ github.sha }}" > image.txt
          - name: Upload image info
            uses: actions/upload-artifact@v4
            with:
              name: image-info
              path: image.txt







    3) CD job — apply Helm chart

    .github/workflows/cd.yml (simplified):






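    name: CD

    on:
      workflow_run:
        workflows: ["CI"]
        types:
          - completed

    permissions:
      actions: read   # needed to download artifacts from the CI run
      contents: read

    jobs:
      deploy:
        # Deploy only when the CI run succeeded
        if: ${{ github.event.workflow_run.conclusion == 'success' }}
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Download image info
            uses: actions/download-artifact@v4
            with:
              name: image-info
              # Artifacts from another workflow run need that run's ID and a token
              run-id: ${{ github.event.workflow_run.id }}
              github-token: ${{ secrets.GITHUB_TOKEN }}
          - name: Set up kubectl
            uses: azure/setup-kubectl@v3
          - name: Azure login
            uses: azure/login@v1
            with:
              creds: ${{ secrets.AZURE_CREDENTIALS }}
          - name: Get AKS credentials
            run: az aks get-credentials --resource-group gitops-rg --name gitops-aks
          - name: Deploy Helm
            run: |
              IMAGE=$(cat image.txt)
              helm upgrade --install myapp ./charts/myapp --set image.repository="${IMAGE%:*}" --set image.tag="${IMAGE##*:}"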
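
The Deploy Helm step relies on shell parameter expansion: `${IMAGE%:*}` drops everything from the last colon onward (leaving the repository) and `${IMAGE##*:}` keeps only what follows the last colon (the tag). The Python sketch below reproduces that split so the logic can be checked in isolation. The helper name `split_image_ref` and the sample tag are hypothetical, and unlike the shell version this sketch raises an error when no colon is present.

```python
def split_image_ref(image):
    """Split 'registry/repo:tag' at the last colon, like ${IMAGE%:*} and ${IMAGE##*:}."""
    repository, sep, tag = image.rpartition(":")
    if not sep:
        # The shell expansions would return the whole string for both parts here.
        raise ValueError(f"image reference has no tag: {image!r}")
    return repository, tag


repo, tag = split_image_ref("myregistry.azurecr.io/myapp:3f2a9c1")
print(repo, tag)
```

Splitting at the last colon matters because a registry host may itself contain a colon (a port), as in `localhost:5000/myapp:v1`.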







    4) Helm chart structure (values.yaml snippet)





    replicaCount: 2
    image:
      repository: myregistry.azurecr.io/myapp
      tag: latest
    service:
      type: ClusterIP
      port: 8080







    5) Option: use Flux CD / ArgoCD for true GitOps

    • Push Helm values or kustomize manifests into a clusters/ git repo. Flux watches and reconciles.


    Common pitfalls & answers

    • Q: How to manage secrets?

      A: Use Azure Key Vault + secrets-store-csi-driver for Kubernetes or GitHub Secrets for CI. Avoid committing sensitive data.
    • Q: How to handle database migrations across deployments?

      A: Use pre-deploy jobs or Kubernetes Job to perform migrations; ensure idempotency.
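
The idempotency requirement for migrations can be sketched with a version table that records what has already been applied, so rerunning the job is harmless. This is a minimal Python sketch using the standard-library `sqlite3` module as a stand-in for the real database; the migration names, SQL statements, and the `schema_migrations` table are assumptions made for illustration.

```python
import sqlite3

# Hypothetical migrations, applied in order; each has a unique version key.
MIGRATIONS = [
    ("001_create_users", "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002_add_email", "ALTER TABLE users ADD COLUMN email TEXT"),
]


def migrate(conn):
    """Apply pending migrations once; return the versions applied in this run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    for version, sql in MIGRATIONS:
        if version in applied:
            continue  # already applied by an earlier run: skip
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)
second_run = migrate(conn)
print(first_run, second_run)
```

The second run applies nothing, which is the behavior a Kubernetes Job needs if it is retried or triggered again by a later deployment.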





    Project 3 — Real-time telemetry pipeline (Event Hubs → Stream Analytics → Cosmos DB + Power BI)

    Problem statement

    Ingest high-volume telemetry from IoT devices or microservices, transform/aggregate in near real time, and store results for query and dashboarding.


    Architecture overview

    Producers → Azure Event Hubs → Stream Analytics job (SQL transformations) → Cosmos DB (or Azure Data Explorer) → Power BI for visualization.


    Step-by-step implementation

    1) Create Event Hub namespace & hub (az cli)





    az group create --name telemetry-rg --location westeurope
    az eventhubs namespace create --name telemetry-ns --resource-group telemetry-rg --location westeurope --sku Standard
    az eventhubs eventhub create --name telemetry-hub --namespace-name telemetry-ns --resource-group telemetry-rg --partition-count 4







    2) Producer sample (Python) — send JSON telemetry

    producer.py:






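    import json
    import time

    from azure.eventhub import EventHubProducerClient, EventData

    # Replace the placeholder with your Event Hubs connection string
    producer = EventHubProducerClient.from_connection_string(
        conn_str="<your-event-hubs-connection-string>", eventhub_name="telemetry-hub"
    )
    with producer:
        # Events sharing a partition key go to the same partition, which preserves per-device order
        batch = producer.create_batch(partition_key="dev-01")
        for i in range(10):
            payload = {"deviceId": "dev-01", "ts": int(time.time()), "temp": 25 + i}
            batch.add(EventData(json.dumps(payload)))
        producer.send_batch(batch)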






    3) Stream Analytics job (simple aggregation)

    In the Stream Analytics job, set the input to the Event Hub and the output to Cosmos DB. The following query computes 1-minute averages per device:






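    SELECT
        System.Timestamp() AS windowEnd,
        deviceId,
        AVG(temp) AS avgTemp,
        COUNT(*) AS cnt
    INTO
        [CosmosOutput]
    FROM
        [EventHubInput]
        -- Window by the event's own epoch-seconds ts; remove this line to window by arrival time
        TIMESTAMP BY DATEADD(second, ts, '1970-01-01T00:00:00Z')
    GROUP BY
        TumblingWindow(minute, 1), deviceId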
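
To make the windowing concrete, the aggregation that query performs can be sketched in plain Python: events are bucketed into fixed, non-overlapping 60-second windows per device, then averaged and counted. This is an illustration of the tumbling-window idea, not how Stream Analytics is implemented; `tumbling_averages` and the sample events are hypothetical, and the boundary handling here (window start inclusive, end exclusive) is a simplification that may differ from the service.

```python
from collections import defaultdict

WINDOW_SECONDS = 60


def tumbling_averages(events):
    """Group events into 60-second tumbling windows per device and
    return {(window_end, device_id): (avg_temp, count)}."""
    buckets = defaultdict(list)
    for event in events:
        # Window start is the timestamp rounded down to the window size.
        window_start = (event["ts"] // WINDOW_SECONDS) * WINDOW_SECONDS
        window_end = window_start + WINDOW_SECONDS
        buckets[(window_end, event["deviceId"])].append(event["temp"])

    return {
        key: (sum(temps) / len(temps), len(temps))
        for key, temps in buckets.items()
    }


sample_events = [
    {"deviceId": "dev-01", "ts": 0, "temp": 20},
    {"deviceId": "dev-01", "ts": 30, "temp": 22},
    {"deviceId": "dev-02", "ts": 45, "temp": 30},
    {"deviceId": "dev-01", "ts": 60, "temp": 26},
]
windows = tumbling_averages(sample_events)
```

Each event lands in exactly one window, so the output has one row per device per minute regardless of how many raw events arrived.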







    4) Cosmos DB output: each window result is written as a JSON document. Power BI can then connect through the Cosmos DB connector; for larger analytics workloads, export to ADLS instead.

    5) Scaling & performance

    • Partitioning: have producers set a partition key (e.g., deviceId) so load spreads across partitions while each device's events stay in order.
    • Event Hubs capacity scales with throughput units and the partition count.
    • Stream Analytics capacity is set by the number of streaming units assigned to the job.
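
The partition-key idea in the first bullet can be sketched as a stable hash from key to partition index. This Python sketch is illustrative only: Event Hubs applies its own internal hashing, and `pick_partition`, the SHA-256 choice, and the device names are assumptions made for the example.

```python
import hashlib


def pick_partition(partition_key, partition_count):
    """Map a key to a partition index with a stable hash.

    Illustrative only: Event Hubs uses its own internal hash function.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


devices = [f"dev-{i:02d}" for i in range(100)]
assignments = {device: pick_partition(device, 4) for device in devices}
```

The same device always maps to the same partition, which is what preserves per-device ordering, while many distinct devices spread across all partitions.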


    Common pitfalls & answers

    • Q: Message ordering lost?

      A: Ordering is per partition; pick partition key carefully.
    • Q: High-cardinality writes to Cosmos DB are getting expensive?

      A: Aggregate / compress in Stream Analytics, or write to ADX for analytics workloads.





    Related tools & libraries

    • Azure SDKs: @azure/service-bus, @azure/storage-blob, azure-eventhub, @azure/cosmos
    • Containerized browsers: puppeteer-core, playwright
    • IaC: Terraform, Bicep
    • CI/CD: GitHub Actions, Azure DevOps Pipelines
    • Kubernetes tools: Helm, FluxCD, ArgoCD, kubectl
    • Monitoring & logging: Azure Monitor, Application Insights, ELK stack
    • Data: Azure Stream Analytics, Azure Data Explorer, Cosmos DB





    Developer tips & best practices

    • Prefer idempotent infra: store Terraform state centrally (for example, an azurerm backend in a storage account) so every run works from the same state.
    • Secrets management: Use Azure Key Vault + managed identities. Avoid plain secrets in YAML.
    • Observability first: Add structured logs (JSON) to apps and ensure trace IDs propagate.
    • Local iteration: Use minikube/kind + skaffold or draft for local app iteration before cluster deploy.
    • Cost guardrails: Use budgets and Azure Policy to prevent oversized SKUs in non-prod.
    • Testing pipelines: Add integration tests as a gate in CI; use ephemeral namespaces for end-to-end runs.
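
The "observability first" tip can be sketched with the standard-library `logging` module: a formatter that emits one JSON object per line and carries a trace ID supplied by the caller. This is a minimal sketch; `JsonFormatter`, `build_logger`, the field names, and the logger name `pdf-worker` are hypothetical choices, not a prescribed schema.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # trace_id is attached by the caller through the `extra` argument
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(entry)


def build_logger(name, stream=None):
    """Create a logger that writes JSON lines to the given stream (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


logger = build_logger("pdf-worker")
trace_id = str(uuid.uuid4())  # in a real service, reuse the ID from the incoming request
logger.info("render started", extra={"trace_id": trace_id})
```

Passing the same `trace_id` through the enqueue function, the queue message, and the worker lets one request be followed across all three in a log query.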





    Common developer questions (short answers)

    Q: Should I use Azure Functions or AKS for background workers?

    A: For short, bursty, event-driven jobs with simple dependencies: Functions. For heavy (GPU, Chromium) or long-running, stateful work: AKS or a dedicated container instance.


    Q: How do I handle retries for Service Bus messages?

    A: Configure maxDeliveryCount and dead-letter queue. Use poison message handling and idempotency in the worker.
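
Idempotency in the worker can be sketched as a check on the message ID before doing any work, so a redelivered message is not processed twice. This Python sketch keeps the IDs in memory purely for illustration; `IdempotentWorker` and the sample message IDs are hypothetical, and a real worker would need durable storage for the processed IDs.

```python
class IdempotentWorker:
    """Skip messages whose ID was already processed successfully.

    Sketch only: a real worker would keep processed IDs in durable
    storage (for example a database), not in memory.
    """

    def __init__(self, handler):
        self._handler = handler
        self._processed_ids = set()

    def handle(self, message_id, body):
        if message_id in self._processed_ids:
            return "skipped"      # duplicate delivery: do nothing
        self._handler(body)       # may raise; the ID is then NOT recorded
        self._processed_ids.add(message_id)
        return "processed"


rendered = []
worker = IdempotentWorker(rendered.append)
outcomes = [
    worker.handle("msg-1", {"url": "https://example.com/a"}),
    worker.handle("msg-1", {"url": "https://example.com/a"}),  # redelivered
    worker.handle("msg-2", {"url": "https://example.com/b"}),
]
```

Because the ID is recorded only after the handler succeeds, a failed attempt stays eligible for retry, while a successful one is never repeated.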


    Q: Do I need Stream Analytics or can I use Functions for streaming transforms?

    A: Stream Analytics is simpler for SQL-style windowed aggregations at scale. Functions allow custom logic but need more operational code.


    Q: How do I test infra safely?

    A: Use ephemeral resource groups, tag resources, and automated teardown. Use Terraform workspaces and CI runners scoped to test subscriptions.





    Conclusion

    These three projects show the core Azure building blocks you’ll use in production: async processing, GitOps CI/CD, and streaming telemetry. Implementing them gives you reusable patterns you can apply across domains.


    CareerByteCode’s collection of 1900+ real-time projects (practical, end-to-end artifacts) accelerates that transition, from zero experience to leader, by giving you reproducible repo, infra, and pipeline blueprints, so you spend time building and learning rather than designing boilerplate.


    Follow me for more dev tutorials — practical code, production patterns, and end-to-end walkthroughs.



