The digital frontier is constantly expanding, and at its bleeding edge sits the convergence of artificial intelligence and web applications. Imagine websites that don't just display information but actively understand, process, and respond to user queries with intelligent, contextual awareness. This isn't science fiction; it's the promise of NLWeb, Microsoft's innovative open-source protocol. NLWeb is transforming traditional websites into dynamic, AI-driven knowledge hubs, capable of seamless integration with vector databases, multiple Large Language Model (LLM) providers, and diverse enterprise data sources.
But building such revolutionary applications is only half the battle. Deploying and managing them efficiently, reliably, and scalably in a production environment is the other, equally critical half. This is where Kubernetes, the undisputed champion of container orchestration, and GitOps, the gold standard for declarative infrastructure management, come into play. When you combine NLWeb's intelligent capabilities with the robust orchestration backbone of Kubernetes, powered by continuous deployment via FluxCD, you unlock a powerful, production-ready ecosystem.
In this comprehensive guide, originally inspired by an insightful article on iunera.com about NLWeb Deployment in Kubernetes GitOps Style with FluxCD, we'll explore the intricacies of deploying NLWeb using modern DevOps practices. We'll dive into leveraging FluxCD for automated continuous deployment and touch upon Azure's robust cloud infrastructure for cloud-native AI applications.
NLWeb: Revolutionizing AI Web Applications
NLWeb isn't just another framework; it represents a fundamental shift in how we conceive and interact with web applications. Unlike the static pages or even dynamic, CRUD-based applications we're accustomed to, NLWeb empowers AI-powered websites to truly understand and engage. It acts as an intelligent layer, enabling natural language understanding and delivering contextually relevant responses, making web experiences profoundly more interactive.
At its core, NLWeb's architecture embraces modern cloud-native principles and adheres to CNCF best practices. This ensures not only scalability but also resilience and flexibility. A key design philosophy is its multi-provider support, integrating seamlessly with various AI providers for embedding and LLM functionalities. This means you're not locked into a single vendor; NLWeb supports embedding providers like OpenAI, Azure OpenAI, Gemini, and Snowflake, and offers flexible LLM integration with providers ranging from Anthropic's Claude AI assistant to various Hugging Face models. This multi-provider approach is a game-changer, fostering resilience, allowing for cost optimization, and enabling experimentation with cutting-edge models without vendor lock-in.
While incredibly powerful, it's worth noting that NLWeb is currently in its early stages of development. However, its design prioritizes ease of use and production deployment considerations, and the community actively contributes bug fixes and enhancements to accelerate its journey towards enterprise readiness. For a deeper dive into how NLWeb processes queries, check out NLWeb's AI Demystified: How an Example Query is Processed in NLWeb.
Why GitOps with FluxCD for NLWeb Deployments?
In the world of Kubernetes, GitOps has emerged as the gold standard for managing deployments. It's a declarative infrastructure management methodology where Git repositories serve as the single source of truth for all infrastructure and application configurations. This approach brings unparalleled levels of automation, auditability, and reliability, making it a perfect fit for NLWeb's cloud-native architecture.
The Core Principles of GitOps:
- Declarative Infrastructure: Your desired state for the entire system (applications, infrastructure, configurations) is described in Git, using manifests like YAML files and Helm charts.
- Version Control: Every change to your infrastructure is a Git commit, providing a complete, auditable history and easy rollback capabilities.
- Automated Deployments: A specialized operator (like FluxCD) continuously monitors the Git repository. When changes are detected, it automatically applies them to the Kubernetes cluster, ensuring that the cluster's actual state converges with the desired state defined in Git.
- Consistency: The same deployment process is used across all environments – development, staging, and production – eliminating configuration drift and manual errors.
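The convergence behavior these principles describe can be made concrete with a toy sketch in Python — purely illustrative, not how FluxCD is actually implemented: given a desired state (from Git) and an actual state (from the cluster), a reconciler computes the changes needed to make them match.

```python
# Toy model of a GitOps reconcile loop: converge the cluster's actual
# state toward the desired state declared in Git. Illustrative only --
# real controllers work on typed Kubernetes objects, not flat dicts.

def reconcile(desired: dict, actual: dict) -> dict:
    """Return the changes needed to make `actual` match `desired`."""
    changes = {}
    for key, value in desired.items():
        if actual.get(key) != value:
            changes[key] = value          # create or update drifted resources
    for key in actual.keys() - desired.keys():
        changes[key] = None               # prune resources removed from Git
    return changes

desired = {"nlweb-deployment": "image:1.2.4", "nlweb-service": "port:8000"}
actual = {"nlweb-deployment": "image:1.2.3", "stale-job": "done"}

# The drifted deployment is updated, the missing service is created,
# and the resource no longer in Git is marked for pruning.
print(reconcile(desired, actual))
```

Running the loop on an interval, and treating Git as the only writable input, is what turns this simple diff-and-apply idea into GitOps.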
For NLWeb deployments, iunera.com provides production-ready Helm charts that encapsulate years of operational experience and best practices. These charts streamline the deployment process, making it straightforward to get NLWeb up and running consistently across various Kubernetes environments.
FluxCD acts as the vigilant GitOps operator in this scenario. It continuously monitors the specified Git repository for changes to your application and infrastructure definitions. Upon detecting a change, FluxCD automatically synchronizes your Kubernetes cluster to reflect that desired state. This eliminates configuration drift, drastically reduces the need for manual intervention, and provides a clear, immutable audit trail of every change applied to your system. The benefits for NLWeb are clear: robust, automated, and auditable deployments.
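In practice, the watched chart source is itself declared to FluxCD as a HelmRepository resource. A minimal sketch follows — the name and namespace match the manifests used later in this guide, but the URL is a placeholder; substitute the chart repository address published by iunera:

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: iunera-helm-charts
  namespace: helmrepos
spec:
  interval: 10m
  url: https://example.com/iunera-helm-charts  # placeholder -- use the real repo URL
```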
Diving Deep: NLWeb's Kubernetes Architecture
Deploying NLWeb on Kubernetes involves orchestrating several key components that collaborate to deliver those intelligent AI-powered web experiences. Let's break down the technical architecture.
Core Components and Configuration
At its heart, the NLWeb application is a Python-based service. It's typically packaged into a Docker image, such as iunera/nlweb, and configured to serve on port 8000. Crucially, it includes comprehensive health checks (liveness and readiness probes) to ensure the application is not only running but also capable of serving requests.
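In Helm-chart terms, such probes might be expressed roughly as follows — an illustrative sketch, since the actual probe paths and thresholds are defined by the chart (a dedicated health endpoint may be used instead of the root path assumed here):

```yaml
livenessProbe:
  httpGet:
    path: /            # assumed probe path
    port: 8000
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /            # assumed probe path
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
```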
NLWeb employs a sophisticated configuration system, leveraging multiple YAML files to manage distinct aspects of its behavior:
- config_webserver.yaml: Controls server settings, CORS policies, SSL configuration, and static file serving.
- config_llm.yaml: Manages Large Language Model (LLM) provider configurations and model selections.
- config_embedding.yaml: Defines embedding provider settings and model preferences.
- config_llm_performance.yaml: Optimizes application performance through caching, rate limiting, and response management.
Security Context and Best Practices
In a production Kubernetes environment, security is paramount. NLWeb's deployment adheres to Kubernetes pod security standards and best practices, including:
- Non-root user execution (UID 999): Minimizes the impact of potential container breakouts.
- Read-only root filesystem: Prevents malicious processes from modifying critical system files.
- Dropped capabilities: Removes unnecessary Linux capabilities, further hardening the container.
- Security contexts: Applied at both the pod and container levels for fine-grained access control.
This robust architecture provides a secure, scalable, and maintainable foundation for NLWeb, enabling it to thrive in demanding production environments.
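Expressed as Helm values, these hardening measures might look like the sketch below — field names follow common chart conventions and may differ slightly in the actual NLWeb chart:

```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 999            # non-root UID noted above
  fsGroup: 999
securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
```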
Helm Chart Structure and Values
The NLWeb Helm chart is designed for extensive customization, allowing developers to tailor deployments to specific needs through its values.yaml configuration. A basic example showcasing some core values might look like this:
replicaCount: 1

image:
  repository: iunera/nlweb
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 8000

env:
  - name: PYTHONPATH
    value: "/app"
  - name: PORT
    value: "8000"
  - name: NLWEB_LOGGING_PROFILE
    value: production
Beyond these basic settings, the chart supports a rich array of advanced features, essential for production deployments:
- Autoscaling: Configuration for Horizontal Pod Autoscaler (HPA) with CPU-based scaling to dynamically adjust replica counts.
- Ingress: Integration with NGINX ingress controllers, including SSL/TLS termination for secure external access.
- Volumes: Support for Persistent Volume Claims (PVCs), ConfigMaps, and EmptyDir volumes to manage data and configuration.
- ConfigMaps: Detailed mechanisms to configure NLWeb's various settings (LLM, vector endpoints, etc.) directly from Kubernetes ConfigMaps.
- Security: Further enforcement of pod security contexts and network policies to isolate and protect the application.
FluxCD in Action: Automating NLWeb Deployments
FluxCD is more than just a deployment tool; it's a continuous delivery solution for Kubernetes that empowers GitOps. It acts as the bridge between your Git repository and your Kubernetes cluster, ensuring that any changes committed to your manifest files are automatically, consistently, and reliably applied.
The HelmRelease Controller
Central to FluxCD's GitOps approach for NLWeb is the HelmRelease custom resource. This powerful Custom Resource Definition (CRD) manages the entire lifecycle of a Helm chart deployment. Here's a typical HelmRelease configuration for NLWeb:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
This manifest instructs FluxCD to continuously monitor the iunera-helm-charts repository. The interval: 1m0s setting ensures FluxCD checks for updates every minute, providing near real-time deployment capabilities. When a new chart version or configuration change is detected in the Git repository, FluxCD automatically performs an upgrade on the NLWeb release in the cluster.
Image Automation and Version Management
Keeping container images up-to-date manually is tedious and error-prone. FluxCD's image automation capabilities elegantly solve this problem for NLWeb deployments. The system can automatically detect new container image versions published to a registry and update the corresponding deployment manifests in Git. This is invaluable for maintaining up-to-date deployments while integrating with proper testing and validation workflows.
To enable this, NLWeb deployments leverage special annotations within the HelmRelease manifest, as shown below:
image:
  repository: iunera/nlweb # {"$imagepolicy": "flux-system:nlweb:name"}
  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}
These annotations serve as directives for FluxCD, telling it to automatically update the image repository and tag values based on an ImagePolicy defined elsewhere. When a new image version is detected that matches the policy criteria, FluxCD not only updates the manifest but also commits these changes back to the Git repository, maintaining the single source of truth.
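To build intuition for what the Setters strategy does with these markers, here is a simplified Python sketch that rewrites the value on any line carrying an $imagepolicy comment. This is illustrative only — FluxCD's real implementation uses kyaml setters, not regular expressions, and the resolved-value map here is invented example data:

```python
import re

# Example data: "namespace:policy:field" markers mapped to the latest
# values the image-reflector controller has resolved for them.
RESOLVED = {
    "flux-system:nlweb:tag": "1.2.5",
}

# Matches:  <indent><key>: <value> # {"$imagepolicy": "<marker>"}
MARKER = re.compile(r'^(\s*\w+:\s*)(\S+)(\s*#\s*\{"\$imagepolicy":\s*"([^"]+)"\})')

def apply_setters(manifest: str) -> str:
    """Replace marked values with their resolved counterparts."""
    out = []
    for line in manifest.splitlines():
        m = MARKER.match(line)
        if m and m.group(4) in RESOLVED:
            line = f'{m.group(1)}{RESOLVED[m.group(4)]}{m.group(3)}'
        out.append(line)
    return "\n".join(out)

manifest = 'image:\n  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}'
# The tag line is rewritten to 1.2.5; unmarked lines pass through untouched.
print(apply_setters(manifest))
```

Because the marker comment survives the rewrite, the next automation run can update the same line again — the comment is the durable link between the manifest field and its ImagePolicy.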
Image Repository and Policy Configuration
The image automation magic is configured through two key resources, often defined in a file like nlweb.imagerepo.yaml:
# ImageRepository defines the Docker image repository to monitor
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: nlweb
  namespace: flux-system
spec:
  image: iunera/nlweb
  interval: 10m
  secretRef:
    name: iunera
---
# ImagePolicy defines which image versions to select
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: nlweb
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: nlweb
  policy:
    semver:
      range: ">=1.0.0"
The ImageRepository resource specifies the Docker image to monitor (iunera/nlweb), how frequently to check for new versions (interval: 10m), and authentication credentials via secretRef for private registries. The ImagePolicy resource, on the other hand, defines the selection criteria for image versions using semantic versioning, ensuring only compatible updates (e.g., >=1.0.0) are applied.
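The semver selection can be illustrated with a small Python sketch — a deliberate simplification, since the real controller uses a full semver library with prerelease and build-metadata handling:

```python
def parse(tag):
    """Parse a 'MAJOR.MINOR.PATCH' tag into a comparable tuple of ints."""
    return tuple(int(part) for part in tag.split("."))

def select_latest(tags, minimum="1.0.0"):
    """Pick the highest tag satisfying a >=minimum semver range."""
    floor = parse(minimum)
    candidates = [t for t in tags if parse(t) >= floor]
    return max(candidates, key=parse)

tags = ["0.9.0", "1.0.0", "1.2.4", "1.2.10", "1.10.0"]
print(select_latest(tags))  # 1.10.0 -- numeric comparison, so it beats 1.2.10
```

Note that the comparison is numeric per component, not lexicographic — that is exactly why semver policies are safer than sorting tags as strings, where "1.10.0" would incorrectly sort below "1.2.4".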
Automation Workflow
The entire image update automation workflow is orchestrated by the ImageUpdateAutomation resource:
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: master
    commit:
      author:
        email: fluxcdbot@nodomain.local
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Changed.FileChanges -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $changes := .Changed.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
          Changes:
        {{- range $_, $change := $changes }}
          - {{ $change.OldValue }} -> {{ $change.NewValue }}
        {{ end -}}
        {{ end -}}
    push:
      branch: master
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  update:
    path: ./kubernetes/common
    strategy: Setters
This resource ensures that FluxCD:
- Checks out the master branch of your Git repository.
- Commits changes with a descriptive message template that details what was updated.
- Pushes these changes back to the master branch.
- Executes this process every 30 minutes.
- Applies updates within the ./kubernetes/common path, using the "Setters" strategy to look for image policy annotations.
Through this sophisticated configuration, your NLWeb deployment automatically stays current with the latest compatible container images, all without manual intervention. This entire process is meticulously tracked in your Git history, providing a complete audit trail and easy rollback capabilities. It's a testament to the power of declarative, automated operations.
Building & Deploying: The CI/CD Pipeline
The journey of NLWeb from source code to a running application in Kubernetes is facilitated by a robust CI/CD pipeline, integrating Docker builds with the GitOps workflow powered by FluxCD.
Dockerfile Structure and Multi-Stage Build
An efficient and secure container image is foundational to any cloud-native application. NLWeb leverages a Docker multi-stage build process to achieve this, separating build dependencies from the final runtime environment. This results in smaller, more secure images.
# Stage 1: Build stage
FROM python:3.13-slim AS builder

# Install build dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc python3-dev && \
    pip install --no-cache-dir --upgrade pip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy requirements file
COPY code/requirements.txt .

# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt

# Copy Docker-specific requirements file
COPY docker_requirements.txt .

# Install Docker-specific Python packages
RUN pip install --no-cache-dir -r docker_requirements.txt

# Stage 2: Runtime stage
FROM python:3.13-slim

# Apply security updates
RUN apt-get update && \
    apt-get install -y --no-install-recommends --only-upgrade \
        $(apt-get --just-print upgrade | grep "^Inst" | grep -i securi | awk '{print $2}') && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Create a non-root user and set permissions
RUN groupadd -r nlweb && \
    useradd -r -g nlweb -d /app -s /bin/bash nlweb && \
    chown -R nlweb:nlweb /app

USER nlweb

# Copy application code
COPY code/ /app/
COPY static/ /app/static/

# Copy installed packages from builder stage
COPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

# Expose the port the app runs on
EXPOSE 8000

# Set environment variables
ENV NLWEB_OUTPUT_DIR=/app
ENV PYTHONPATH=/app
ENV PORT=8000
ENV VERSION=1.2.4

# Command to run the application
CMD ["python", "app-file.py"]
Key aspects of this Dockerfile include:
- Stage 1 (Builder): Installs all necessary build-time dependencies, ensuring the final image is lean.
- Stage 2 (Runtime): Creates a minimal, secure runtime environment, applying security updates and removing build tools.
- Security Features: Enforces non-root user execution (USER nlweb), a best practice for container security, and ensures minimal dependencies are present.
- Version Definition: The ENV VERSION variable is critical for consistent image tagging.
GitHub Actions Workflow
When changes are pushed to the iuneracustomizations branch and the Dockerfile is modified, a GitHub Actions CI/CD automation workflow, typically defined in .github/workflows/prod-build.yml, is triggered. This workflow orchestrates the build and push process for the NLWeb Docker image.
name: prod-build

on:
  push:
    branches:
      - iuneracustomizations
    paths:
      - Dockerfile

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Private Registry
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Extract Version from Dockerfile
        id: extract_version
        run: |
          # Extract the VERSION from Dockerfile
          VERSION=$(grep "ENV VERSION=" Dockerfile | cut -d= -f2)
          echo "VERSION=${VERSION}" >> $GITHUB_ENV
          echo "Using version from Dockerfile: ${VERSION}"

      - name: Build the Docker image
        run: |
          docker build -t iunera/nlweb:latest -t iunera/nlweb:${{ env.VERSION }} .
          docker push iunera/nlweb:latest
          docker push iunera/nlweb:${{ env.VERSION }}
          echo "Built and pushed Docker image with tags: latest, ${{ env.VERSION }}"

      - name: Inspect
        run: |
          docker image inspect iunera/nlweb:latest

      - name: Create and Push Git Tag
        run: |
          git config --global user.name "GitHub Actions"
          git config --global user.email "actions@github.com"
          git tag -a v${{ env.VERSION }} -m "Release version ${{ env.VERSION }}"
          git push origin v${{ env.VERSION }}
This workflow handles crucial steps:
- Checkout Repository: Clones the codebase.
- Set up Docker Buildx & QEMU: Configures Docker for multi-architecture builds (ARM64, AMD64), essential for modern cloud environments.
- Log in to Private Registry: Authenticates with Docker Hub (or any other registry) using GitHub Secrets.
- Extract Version: Dynamically parses the Dockerfile to get the VERSION environment variable, ensuring consistency.
- Build and Push: Builds the Docker image and tags it with both latest and the extracted version number, pushing both to Docker Hub.
- Create Git Tag: Creates and pushes a Git tag, linking the code version to the deployed image.
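The version-extraction convention is easy to mirror (and unit-test) outside CI; an illustrative Python equivalent of the grep/cut pipeline, not part of the actual workflow:

```python
import re

def extract_version(dockerfile_text):
    """Return the value of the first `ENV VERSION=...` line, mirroring the
    `grep "ENV VERSION=" | cut -d= -f2` step in the workflow."""
    m = re.search(r'^ENV VERSION=(\S+)', dockerfile_text, flags=re.MULTILINE)
    if m is None:
        raise ValueError("no ENV VERSION= line found in Dockerfile")
    return m.group(1)

sample = "FROM python:3.13-slim\nENV VERSION=1.2.4\nEXPOSE 8000\n"
print(extract_version(sample))  # 1.2.4
```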
Complete CI/CD to Deployment Flow
This intricate pipeline connects development to deployment seamlessly:
- Development: A developer modifies the Dockerfile (perhaps updating the VERSION) or application code.
- CI/CD: GitHub Actions automatically builds and pushes the new Docker image to Docker Hub.
- Automation: FluxCD, continuously monitoring Docker Hub, detects the new image version.
- GitOps: FluxCD updates the Kubernetes manifests in the Git repository with the new image version and commits these changes.
- Deployment: FluxCD, observing the Git repository, applies these changes to the Kubernetes cluster, deploying new pods with the updated NLWeb image.
This GitOps approach ensures your Git repository is always the single source of truth, all changes are auditable, deployments are automated and consistent, and rollbacks are straightforward and reliable.
Local Development Environment
While GitHub Actions handles production builds, local development for NLWeb is streamlined using Docker Compose, providing a consistent environment:
services:
  nlweb:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: nlweb
    ports:
      - "8000:8000"
    env_file:
      - ./code/.env
    environment:
      - PYTHONPATH=/app
      - PORT=8000
    volumes:
      - ./data:/data
      - ./code/config:/app/config:ro
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000')\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    user: nlweb
This Docker Compose setup mirrors production by using the same Dockerfile, mounting local directories for data and configuration, loading environment variables, and including health checks, all while running as the secure, non-root nlweb user.
Cloud-Native with Azure: Powering AI Infrastructure
For organizations deeply embedded in Microsoft's cloud ecosystem, NLWeb offers exceptional synergy with Azure services. This native integration makes it an ideal choice for building scalable, enterprise-grade AI solutions.
NLWeb natively supports:
- Azure Cognitive Search: For robust vector search capabilities, NLWeb integrates with Azure's vector search service, enabling scalable and performant similarity searches across massive datasets. This is a critical component for Retrieval Augmented Generation (RAG) patterns in AI applications. For more on optimizing RAG, consider exploring Enterprise AI Excellence: How to do an Agentic Enterprise RAG and Polyglot Knowledge RAG Ingestion Concept for Enterprise Ready AIs.
- Azure OpenAI Service: Direct integration with Azure's enterprise-grade OpenAI offerings, including models like GPT-4 and its embedding models. This ensures high-performance AI capabilities with the added benefits of Azure's security, compliance, and governance features.
- Azure Container Registry (ACR): Seamless integration with ACR for secure container image management, storage, and vulnerability scanning, completing the cloud-native CI/CD loop.
Configuration for these Azure services is elegantly handled through Kubernetes environment variables and ConfigMaps, allowing for easy management across different environments while adhering to security best practices:
env:
  - name: AZURE_VECTOR_SEARCH_ENDPOINT
    value: "https://your-vector-search-db.search.windows.net"
  - name: AZURE_OPENAI_ENDPOINT
    value: "https://your-openai-instance.openai.azure.com/"
Achieving Production-Readiness with NLWeb
Beyond basic deployment, NLWeb is engineered with several production-ready features that make it suitable for demanding enterprise workloads.
Multi-Provider LLM Support
As highlighted earlier, NLWeb's multi-provider LLM support is a cornerstone of its design. It supports a diverse range of models from:
- OpenAI: GPT-4.1, GPT-4.1-mini, GPT-4-turbo, GPT-3.5-turbo.
- Anthropic: Claude-3-7-sonnet-latest, Claude-3-5-haiku-latest, Claude-3-opus-20240229.
- Azure OpenAI: Enterprise-grade models with Azure's security.
- Google Gemini: chat-bison models, Gemini 1.5 Pro, Gemini 1.5 Flash.
- Snowflake: Arctic embedding models and Claude integration.
- Hugging Face: Various open-source models, including the Qwen2.5 series and sentence-transformers/all-mpnet-base-v2.
This breadth of support allows organizations to strategically optimize costs by using different models for varying use cases, enhance service availability through provider redundancy, and experiment with cutting-edge models without being tethered to a single vendor. It also provides flexibility to integrate with custom models, potentially from an Enterprise MCP Server Development project.
Performance Optimization and Caching
To ensure NLWeb performs optimally and minimizes API costs (especially critical with LLM interactions), sophisticated caching mechanisms are implemented:
cache:
  enable: true
  max_size: 1000
  ttl: 0  # No expiration
  include_schema: true
  include_provider: true
  include_model: true
This caching system is intelligent, considering factors like schema, provider, and model when generating cache keys. This ensures high cache hit rates while maintaining the accuracy and relevance of AI responses. Further tuning options live in the config_llm_performance.yaml file introduced earlier.
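A hedged sketch of how such a composite cache key could be derived — illustrative only, as NLWeb's actual key derivation may differ; the function name and parameters here are invented for the example:

```python
import hashlib
import json

def cache_key(prompt, provider, model, schema=None,
              include_schema=True, include_provider=True, include_model=True):
    """Build a deterministic cache key from the prompt plus whichever
    context fields the cache configuration says to include."""
    parts = {"prompt": prompt}
    if include_provider:
        parts["provider"] = provider
    if include_model:
        parts["model"] = model
    if include_schema and schema is not None:
        parts["schema"] = schema
    # Canonical JSON (sorted keys) so equal inputs always hash identically.
    canonical = json.dumps(parts, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

k1 = cache_key("What is NLWeb?", "azure_openai", "gpt-4o")
k2 = cache_key("What is NLWeb?", "openai", "gpt-4o")
print(k1 != k2)  # True: different providers yield different cache entries
```

Including provider and model in the key prevents a response generated by one model from being served as a cache hit for another, at the cost of a lower hit rate — which is why these dimensions are toggles rather than hard-coded.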
Enterprise Data Integration
NLWeb's true power shines through its ability to seamlessly integrate with diverse enterprise data sources. Building on foundational concepts for data exposure (as discussed in Guide: Exposing Enterprise Data with Java and Spring for AI Indexing (for NLWeb)), NLWeb supports:
- JSON-LD and Schema.org: Facilitates structured data integration for enhanced semantic web capabilities, making data more understandable by AI. More on this can be found in How Markdown, JSON-LD and Schema.org Improve Vectorsearch RAGs and NLWeb.
- Vector Database Integration: Compatible with various vector databases, including Azure Cognitive Search, for efficient similarity search.
- Real-time Data Processing: Capabilities for stream processing to handle dynamic content updates.
- Enterprise Security: Features like role-based access control (RBAC) and data governance to protect sensitive information.
NLWeb GitOps vs. Traditional Approaches: A Comparison
To truly appreciate the value of NLWeb deployed with GitOps and FluxCD, it's helpful to contrast it with more traditional deployment methods:
- Scalability: GitOps-managed NLWeb on Kubernetes offers auto-scaling via Horizontal Pod Autoscalers (HPAs), dynamically adjusting to demand. Traditional methods often involve limited vertical scaling or cumbersome manual scaling.
- Deployment Speed: Automated deployments via GitOps mean changes are rapidly applied from commit to production. Manual or portal-based deployments are inherently slower and more error-prone.
- Configuration Management: Git-based versioning provides a complete history and single source of truth for NLWeb configurations. Traditional approaches often rely on portal-based settings or disparate file-based configurations across servers.
- Multi-environment Support: Kubernetes namespaces and Helm charts provide native support for consistent multi-environment deployments. Traditional methods require separate application instances or even entire server fleets.
- Rollback Capabilities: Git-based rollbacks are fast, reliable, and well-audited with GitOps. Other methods offer limited or manual rollback processes.
- Cost Optimization: Kubernetes allows for resource-based pricing and efficient utilization. Traditional infrastructure often incurs higher fixed infrastructure costs or less granular cloud billing.
- Monitoring & Observability: NLWeb on Kubernetes leverages robust Kubernetes-native tools and integrations with platforms like Azure Monitor. Traditional setups require custom monitoring configurations.
- Security: Kubernetes' pod security contexts combined with network policies provide strong security. Traditional Linux installs require extensive manual security hardening.
The iunera.com Helm charts specifically designed for NLWeb provide a significant advantage in this comparison, offering production-tested configurations that mitigate common deployment pitfalls and accelerate time to value.
Advanced Configuration: Real-World Examples
This section delves into practical, production-ready configuration examples, providing templates for your NLWeb deployments.
Complete Helm Installation Manifest Examples
Basic Development Setup
A minimal HelmRelease for a development environment:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-dev
  namespace: nlweb-dev
spec:
  releaseName: nlweb-dev
  targetNamespace: nlweb-dev
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 5m0s
  install:
    createNamespace: true
  values:
    replicaCount: 1
    image:
      repository: iunera/nlweb
      tag: "latest"
      pullPolicy: Always
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: development
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-secrets
            key: openai-api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
        - host: nlweb-dev.local
          paths:
            - path: /
              pathType: ImplementationSpecific
    resources:
      requests:
        cpu: 100m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 1Gi
This manifest sets up a single replica, pulls the latest image, enables a basic NGINX ingress, and configures development-specific logging and resource requests.
Production-Ready Setup with Multi-Provider LLM Support
For a robust production environment, integrating multiple AI providers and advanced security:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-prod
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
  install:
    createNamespace: false
  upgrade:
    remediation:
      retries: 3
  values:
    replicaCount: 3
    image:
      repository: iunera/nlweb
      tag: "1.2.4"
      pullPolicy: IfNotPresent
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: production
      - name: AZURE_VECTOR_SEARCH_ENDPOINT
        value: "https://nlweb-prod.search.windows.net"
      - name: AZURE_VECTOR_SEARCH_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-secrets
            key: vector-search-key
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-openai-secrets
            key: api-key
      - name: ANTHROPIC_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-anthropic-secrets
            key: api-key
      - name: AZURE_OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-openai-secrets
            key: api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
        kubernetes.io/tls-acme: "true"
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
        nginx.ingress.kubernetes.io/enable-modsecurity: "true"
        nginx.ingress.kubernetes.io/enable-owasp-core-rules: "true"
        nginx.ingress.kubernetes.io/rate-limit: "100"
        nginx.ingress.kubernetes.io/rate-limit-window: "1m"
      hosts:
        - host: nlweb.example.com
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
        - secretName: nlweb-tls
          hosts:
            - nlweb.example.com
    resources:
      requests:
        cpu: 200m
        memory: 1Gi
      limits:
        cpu: 1000m
        memory: 2Gi
    autoscaling:
      enabled: true
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilizationPercentage: 70
      targetMemoryUtilizationPercentage: 80
This production example demonstrates increased replicaCount, specific image tags, environment variables sourced from Kubernetes Secrets for API keys, and comprehensive ingress configuration including TLS, Cert-Manager, ModSecurity, OWASP rules, and rate limiting. It also enables Horizontal Pod Autoscaling based on CPU and memory usage.
Comprehensive ConfigMap Customization Examples
NLWeb's flexibility truly shines when using ConfigMaps to manage its detailed configurations.
Web Server Configuration for Different Environments
Development Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-dev-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: development
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins: "*"  # Allow all origins in dev
            max_connections: 50
            timeout: 60
          logging:
            level: debug
            file: ./logs/webserver.log
            console: true
          static:
            enable_cache: false  # Disable caching in dev
            gzip_enabled: false
Production Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: production
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins:
              - https://nlweb.example.com
              - https://api.example.com
              - https://admin.example.com
            max_connections: 200
            timeout: 30
            ssl:
              enabled: true
              cert_file_env: SSL_CERT_FILE
              key_file_env: SSL_KEY_FILE
          logging:
            level: info
            file: ./logs/webserver.log
            console: false
            rotation:
              max_size: 100MB
              max_files: 10
          static:
            enable_cache: true
            cache_max_age: 86400  # 24 hours
            gzip_enabled: true
            compression_level: 6
These examples showcase how ConfigMaps allow for granular control over NLWeb's web server behavior, adjusting settings like CORS policies, logging levels, SSL, and caching for different environments.
Multi-Provider LLM Configuration
An enterprise LLM setup with fallback providers, ensuring resilience and cost optimization:
volumes:
  configMaps:
    - name: nlweb-llm-config
      mountPath: /app/config
      data:
        config_llm.yaml: |-
          preferred_endpoint: azure_openai
          fallback_strategy: round_robin
          endpoints:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-12-01-preview"
              llm_type: azure_openai
              models:
                high: gpt-4o
                low: gpt-4o-mini
              rate_limits:
                requests_per_minute: 1000
                tokens_per_minute: 150000
              retry_config:
                max_retries: 3
                backoff_factor: 2
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              llm_type: openai
              models:
                high: gpt-4-turbo
                low: gpt-3.5-turbo
              rate_limits:
                requests_per_minute: 500
                tokens_per_minute: 90000
            anthropic:
              api_key_env: ANTHROPIC_API_KEY
              llm_type: anthropic
              models:
                high: claude-3-opus-20240229
                low: claude-3-haiku-20240307
              rate_limits:
                requests_per_minute: 300
                tokens_per_minute: 60000
            gemini:
              api_key_env: GCP_PROJECT
              llm_type: gemini
              models:
                high: gemini-1.5-pro
                low: gemini-1.5-flash
              rate_limits:
                requests_per_minute: 200
                tokens_per_minute: 40000
This robust ConfigMap illustrates how to define multiple LLM endpoints, their models, API keys (referenced via environment variables), rate limits, and fallback strategies. This is crucial for building resilient AI applications.
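The fallback behavior that this configuration enables can be sketched in a few lines of Python. The `dispatch` function and the fake client below are illustrative stand-ins under the assumption that a preferred endpoint is tried first (with retries and exponential backoff, per `retry_config`) before falling back to the remaining endpoints in order; NLWeb's real client logic may differ.

```python
# Illustrative sketch of preferred-endpoint dispatch with fallback, mirroring
# the preferred_endpoint / retry_config fields in the ConfigMap above.
ENDPOINTS = ["azure_openai", "openai", "anthropic", "gemini"]

def dispatch(prompt, call_endpoint, preferred="azure_openai",
             max_retries=3, backoff_factor=2):
    """Try the preferred endpoint with retries, then fall back in order."""
    order = [preferred] + [e for e in ENDPOINTS if e != preferred]
    for name in order:
        delay = 1
        for _attempt in range(max_retries):
            try:
                return name, call_endpoint(name, prompt)
            except Exception:
                # A real client would time.sleep(delay) here before retrying.
                delay *= backoff_factor
        # All retries for this endpoint failed; fall back to the next one.
    raise RuntimeError("all LLM endpoints failed")

# Simulate an Azure outage: the first provider raises, the next succeeds.
def flaky_client(name, prompt):
    if name == "azure_openai":
        raise ConnectionError("simulated outage")
    return f"{name} answered: {prompt!r}"

endpoint, answer = dispatch("What is NLWeb?", flaky_client)
print(endpoint)  # openai
```

Because the API keys are referenced by environment-variable name (`api_key_env`), the same ConfigMap can be promoted across environments while the actual secrets stay in Kubernetes Secrets.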
Embedding Provider Configuration for Vector Search
Configuring multiple embedding providers, essential for robust RAG architectures:
```yaml
volumes:
  configMaps:
    - name: nlweb-embedding-config
      mountPath: /app/config
      data:
        config_embedding.yaml: |-
          preferred_provider: azure_openai
          fallback_providers:
            - openai
            - snowflake
          providers:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-10-21"
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 1000
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 500
            snowflake:
              api_key_env: SNOWFLAKE_PAT
              api_endpoint_env: SNOWFLAKE_ACCOUNT_URL
              api_version_env: "2024-10-01"
              model: snowflake-arctic-embed-l
              dimensions: 1024
              batch_size: 50
              rate_limits:
                requests_per_minute: 200
            huggingface:
              api_key_env: HF_TOKEN
              model: sentence-transformers/all-mpnet-base-v2
              dimensions: 768
              local_inference: true
              device: cpu
```
This ConfigMap demonstrates how NLWeb can be configured to use a primary embedding provider (e.g., Azure OpenAI) with defined fallbacks (OpenAI, Snowflake, Hugging Face), specifying models, dimensions, batch sizes, and rate limits for each. This ensures reliable and performant vector generation for search and RAG.
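The `batch_size` and `dimensions` values above drive how documents are chunked into API calls. The Python sketch below is illustrative (`embed_all` and `fake_embed` are hypothetical stand-ins, not NLWeb functions); one practical caveat it encodes: fallback providers with different output dimensions (3072 vs. 1024 above) cannot share a single vector index, so falling back may require a separately maintained index.

```python
# Illustrative sketch of provider-aware embedding batching, using the
# batch_size and dimensions values from the ConfigMap above.
PROVIDERS = {
    "azure_openai": {"dimensions": 3072, "batch_size": 100},
    "openai":       {"dimensions": 3072, "batch_size": 100},
    "snowflake":    {"dimensions": 1024, "batch_size": 50},
}

def embed_all(texts, provider, embed_batch):
    """Embed texts in provider-sized batches; embed_batch stands in for the API call."""
    cfg = PROVIDERS[provider]
    vectors = []
    for i in range(0, len(texts), cfg["batch_size"]):
        batch = texts[i:i + cfg["batch_size"]]
        vectors.extend(embed_batch(batch, cfg["dimensions"]))
    return vectors

# Fake embedding call: returns one zero-vector of the right size per text.
def fake_embed(batch, dims):
    return [[0.0] * dims for _ in batch]

texts = [f"doc {i}" for i in range(120)]
vectors = embed_all(texts, "snowflake", fake_embed)
print(len(vectors), len(vectors[0]))  # 120 1024
```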
Performance Optimization Configuration
Optimizing for high performance and efficient resource usage:
```yaml
volumes:
  configMaps:
    - name: nlweb-performance-config
      mountPath: /app/config
      data:
        config_llm_performance.yaml: |-
          # LLM performance settings
          representation:
            use_compact: true
            limit: 10
            include_metadata: true
          cache:
            enable: true
            max_size: 10000
            ttl: 3600  # 1 hour
            include_schema: true
            include_provider: true
            include_model: true
            include_user_context: false
            compression: gzip
          rate_limiting:
            enable: true
            requests_per_minute: 1000
            burst_size: 100
            per_user_limit: 50
          monitoring:
            enable_metrics: true
            metrics_port: 9090
            health_check_interval: 30
            performance_logging: true
```
This ConfigMap shows how to fine-tune caching parameters (size, TTL, key components, compression) and implement rate limiting, critical for managing LLM API costs and ensuring service stability. It also includes monitoring settings, enabling robust observability.
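The caching policy described by the `cache` block (bounded size, TTL expiry, and a key composed of the enabled `include_*` components) can be sketched as follows. This `ResponseCache` class is an illustrative model of the policy, not NLWeb's implementation.

```python
import time
from collections import OrderedDict

class ResponseCache:
    """Sketch of a size-bounded TTL cache matching the config block above."""

    def __init__(self, max_size=10000, ttl=3600):
        self.max_size, self.ttl = max_size, ttl
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def _key(self, prompt, provider, model, schema=None):
        # include_provider / include_model / include_schema are true in the
        # config, so all three become part of the cache key; user context
        # (include_user_context: false) is deliberately left out.
        return (schema, provider, model, prompt)

    def get(self, prompt, provider, model, schema=None):
        key = self._key(prompt, provider, model, schema)
        entry = self._entries.get(key)
        if entry is None or time.monotonic() > entry[0]:
            self._entries.pop(key, None)  # expired or missing
            return None
        return entry[1]

    def put(self, value, prompt, provider, model, schema=None):
        key = self._key(prompt, provider, model, schema)
        self._entries[key] = (time.monotonic() + self.ttl, value)
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the oldest entry

cache = ResponseCache(max_size=2, ttl=3600)
cache.put("answer-1", "q1", "azure_openai", "gpt-4o")
print(cache.get("q1", "azure_openai", "gpt-4o"))  # answer-1
print(cache.get("q1", "openai", "gpt-4-turbo"))   # None (different key)
```

Excluding user context from the key (as the config does) means identical prompts are shared across users, maximizing hit rate at the cost of per-user personalization in cached responses.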
Environment-Specific Volume Configurations
Managing persistent storage and configuration mounting for different environments:
Development with Hot Reloading:
```yaml
volumes:
  enabled: true
  emptyDirs:
    - name: data
      mountPath: /app/data
    - name: logs
      mountPath: /app/logs
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  # Development: use hostPath for easy file access
  hostPaths:
    - name: dev-config
      hostPath: /local/dev/nlweb/config
      mountPath: /app/config
      type: DirectoryOrCreate
```
Production with Persistent Storage:
```yaml
volumes:
  enabled: true
  emptyDirs:
    - name: tmp
      mountPath: /tmp
      sizeLimit: 1Gi
  pvc:
    enabled: true
    storageClass: fast-ssd
    size: 50Gi
    accessMode: ReadWriteOnce
    mountPath: /app/data
  # Production: use ConfigMaps for configuration
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
    - name: nlweb-llm-config
      mountPath: /app/config/llm
    - name: nlweb-embedding-config
      mountPath: /app/config/embedding
  # Production: use Secrets for sensitive data
  existingSecrets:
    - name: nlweb-api-keys
      mountPath: /app/secrets
      defaultMode: 0400
```
These examples illustrate how to dynamically manage volumes, using hostPath for local development convenience and Persistent Volume Claims (PVCs) for robust, scalable storage in production, alongside mounting ConfigMaps and Secrets for secure configuration management.
Getting Started: Step-by-Step Helm Installation Guide
Ready to get your hands dirty? Here's how to begin deploying NLWeb in your Kubernetes cluster.
Prerequisites Setup
Before deploying NLWeb, ensure you have the following prerequisites in place:
1. Add the Iunera Helm repository:

```shell
helm repo add iunera https://iunera.github.io/helm-charts/
helm repo update
```

2. Create the namespace and secrets:

```shell
# Create the namespace
kubectl create namespace nlweb

# Create secrets for API keys (replace with your actual keys)
kubectl create secret generic nlweb-openai-secrets \
  --from-literal=api-key="your-openai-api-key" \
  -n nlweb

kubectl create secret generic nlweb-azure-secrets \
  --from-literal=vector-search-key="your-azure-search-key" \
  --from-literal=openai-api-key="your-azure-openai-key" \
  -n nlweb
```

3. Install with custom values:

After preparing your `nlweb-values.yaml` file (using the advanced examples above as a starting point, tailored to your needs), you would apply it via FluxCD by committing it to your Git repository. For a direct Helm install (e.g., in a dev environment), you'd run:

```shell
helm install nlweb iunera/nlweb -n nlweb -f nlweb-values.yaml
```
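For the GitOps path, the same release can be expressed declaratively as a FluxCD `HelmRelease` that pulls its values from a ConfigMap committed to your Git repository. The resource names, namespaces, and interval below are illustrative assumptions, not prescribed by the chart:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2   # v2beta1/v2beta2 on older Flux versions
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  interval: 10m
  chart:
    spec:
      chart: nlweb
      sourceRef:
        kind: HelmRepository
        name: iunera          # assumes a HelmRepository object for the repo added above
        namespace: flux-system
  valuesFrom:
    - kind: ConfigMap
      name: nlweb-values      # holds the contents of your nlweb-values.yaml
```

With this in place, Flux reconciles the release whenever the chart version or the committed values change, which is exactly the automated, auditable loop the GitOps workflow promises.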
This foundational setup, combined with the GitOps workflow, will get your NLWeb instance operational and ready to serve intelligent web experiences.
Conclusion
The landscape of AI-powered web applications is rapidly transforming, and NLWeb stands at the forefront, offering a unique blend of intelligence, flexibility, and cloud-native design. By embracing GitOps methodologies with FluxCD and leveraging the robust capabilities of Kubernetes and Azure, developers can create a truly production-ready ecosystem for NLWeb. This approach ensures automated, consistent, and auditable deployments, allowing teams to focus on innovation rather than operational overhead.
Whether you're building a new AI-driven knowledge hub or transforming existing web experiences, understanding and implementing these modern DevOps practices for NLWeb will be crucial for success. The combination of open-source innovation, declarative operations, and powerful cloud infrastructure provides an unstoppable force for the future of intelligent web applications. For further exploration into related fields, consider iunera.com's expertise in Apache Druid AI Consulting Europe or their advancements in Enterprise MCP Server Development.
We encourage you to dive deeper, experiment with NLWeb, and contribute to its evolving journey. The future of intelligent web is here, and it's powered by code, AI, and GitOps.