The digital frontier is constantly expanding, and at its bleeding edge sits the convergence of artificial intelligence and web applications. Imagine websites that don't just display information but actively understand, process, and respond to user queries with intelligent, contextual awareness. This isn't science fiction; it's the promise of NLWeb, Microsoft's innovative open-source protocol. NLWeb is transforming traditional websites into dynamic, AI-driven knowledge hubs, capable of seamless integration with vector databases, multiple Large Language Model (LLM) providers, and diverse enterprise data sources.
But building such revolutionary applications is only half the battle. Deploying and managing them efficiently, reliably, and scalably in a production environment is the other, equally critical half. This is where Kubernetes, the undisputed champion of container orchestration, and GitOps, the gold standard for declarative infrastructure management, come into play. When you combine NLWeb's intelligent capabilities with the robust orchestration backbone of Kubernetes, powered by continuous deployment via FluxCD, you unlock a powerful, production-ready ecosystem.
In this comprehensive guide, originally inspired by an insightful article on iunera.com about NLWeb Deployment in Kubernetes GitOps Style with FluxCD, we'll explore the intricacies of deploying NLWeb using modern DevOps practices. We'll dive into leveraging FluxCD for automated continuous deployment and touch upon Azure's robust cloud infrastructure for cloud-native AI applications.
NLWeb: Revolutionizing AI Web Applications
NLWeb isn't just another framework; it represents a fundamental shift in how we conceive and interact with web applications. Unlike the static pages or even dynamic, CRUD-based applications we're accustomed to, NLWeb empowers AI-powered websites to truly understand and engage. It acts as an intelligent layer, enabling natural language understanding and delivering contextually relevant responses, making web experiences profoundly more interactive.
At its core, NLWeb's architecture embraces modern cloud-native principles and adheres to CNCF best practices. This ensures not only scalability but also resilience and flexibility. A key design philosophy is its multi-provider support, integrating seamlessly with various AI providers for embedding and LLM functionalities. This means you're not locked into a single vendor; NLWeb supports embedding providers like OpenAI, Azure OpenAI, Gemini, and Snowflake, and offers flexible LLM integration with providers ranging from Anthropic's Claude AI assistant to various Hugging Face models. This multi-provider approach is a game-changer, fostering resilience, allowing for cost optimization, and enabling experimentation with cutting-edge models without vendor lock-in.
While incredibly powerful, it's worth noting that NLWeb is currently in its early stages of development. However, its design prioritizes ease of use and production deployment considerations, and the community actively contributes bug fixes and enhancements to accelerate its journey towards enterprise readiness. For a deeper dive into how NLWeb processes queries, check out NLWeb's AI Demystified: How an Example Query is Processed in NLWeb.
Why GitOps with FluxCD for NLWeb Deployments?
In the world of Kubernetes, GitOps has emerged as the gold standard for managing deployments. It's a declarative infrastructure management methodology where Git repositories serve as the single source of truth for all infrastructure and application configurations. This approach brings unparalleled levels of automation, auditability, and reliability, making it a perfect fit for NLWeb's cloud-native architecture.
The Core Principles of GitOps:
- Declarative Infrastructure: Your desired state for the entire system (applications, infrastructure, configurations) is described in Git, using manifests like YAML files and Helm charts.
- Version Control: Every change to your infrastructure is a Git commit, providing a complete, auditable history and easy rollback capabilities.
- Automated Deployments: A specialized operator (like FluxCD) continuously monitors the Git repository. When changes are detected, it automatically applies them to the Kubernetes cluster, ensuring that the cluster's actual state converges with the desired state defined in Git.
- Consistency: The same deployment process is used across all environments – development, staging, and production – eliminating configuration drift and manual errors.
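The convergence behavior these principles describe can be made concrete with a toy sketch in Python — purely illustrative, not how FluxCD is actually implemented: given a desired state (from Git) and an actual state (from the cluster), a reconciler computes the changes needed to make them match.

```python
# Toy model of a GitOps reconcile loop: converge the cluster's actual
# state toward the desired state declared in Git. Illustrative only --
# real controllers work on typed Kubernetes objects, not flat dicts.

def reconcile(desired: dict, actual: dict) -> dict:
    """Return the changes needed to make `actual` match `desired`."""
    changes = {}
    for key, value in desired.items():
        if actual.get(key) != value:
            changes[key] = value          # create or update drifted resources
    for key in actual.keys() - desired.keys():
        changes[key] = None               # prune resources removed from Git
    return changes

desired = {"nlweb-deployment": "image:1.2.4", "nlweb-service": "port:8000"}
actual = {"nlweb-deployment": "image:1.2.3", "stale-job": "done"}

# The drifted deployment is updated, the missing service is created,
# and the resource no longer in Git is marked for pruning.
print(reconcile(desired, actual))
```

Running the loop on an interval, and treating Git as the only writable input, is what turns this simple diff-and-apply idea into GitOps.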
For NLWeb deployments, iunera.com provides production-ready Helm charts that encapsulate years of operational experience and best practices. These charts streamline the deployment process, making it straightforward to get NLWeb up and running consistently across various Kubernetes environments.
FluxCD acts as the vigilant GitOps operator in this scenario. It continuously monitors the specified Git repository for changes to your application and infrastructure definitions. Upon detecting a change, FluxCD automatically synchronizes your Kubernetes cluster to reflect that desired state. This eliminates configuration drift, drastically reduces the need for manual intervention, and provides a clear, immutable audit trail of every change applied to your system. The benefits for NLWeb are clear: robust, automated, and auditable deployments.
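In practice, the watched chart source is itself declared to FluxCD as a HelmRepository resource. A minimal sketch follows — the name and namespace match the manifests used later in this guide, but the URL is a placeholder; substitute the chart repository address published by iunera:

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: iunera-helm-charts
  namespace: helmrepos
spec:
  interval: 10m
  url: https://example.com/iunera-helm-charts  # placeholder -- use the real repo URL
```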
Diving Deep: NLWeb's Kubernetes Architecture
Deploying NLWeb on Kubernetes involves orchestrating several key components that collaborate to deliver those intelligent AI-powered web experiences. Let's break down the technical architecture.
Core Components and Configuration
At its heart, the NLWeb application is a Python-based service. It's typically packaged into a Docker image, such as iunera/nlweb, and configured to serve on port 8000. Crucially, it includes comprehensive health checks (liveness and readiness probes) to ensure the application is not only running but also capable of serving requests.
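In Helm-chart terms, such probes might be expressed roughly as follows — an illustrative sketch, since the actual probe paths and thresholds are defined by the chart (a dedicated health endpoint may be used instead of the root path assumed here):

```yaml
livenessProbe:
  httpGet:
    path: /            # assumed probe path
    port: 8000
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /            # assumed probe path
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
```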
NLWeb employs a sophisticated configuration system, leveraging multiple YAML files to manage distinct aspects of its behavior:
- config_webserver.yaml: Controls server settings, CORS policies, SSL configuration, and static file serving.
- config_llm.yaml: Manages Large Language Model (LLM) provider configurations and model selections.
- config_embedding.yaml: Defines embedding provider settings and model preferences.
- config_llm_performance.yaml: Optimizes application performance through caching, rate limiting, and response management.
Security Context and Best Practices
In a production Kubernetes environment, security is paramount. NLWeb's deployment adheres to Kubernetes pod security standards and best practices, including:
- Non-root user execution (UID 999): Minimizes the impact of potential container breakouts.
- Read-only root filesystem: Prevents malicious processes from modifying critical system files.
- Dropped capabilities: Removes unnecessary Linux capabilities, further hardening the container.
- Security contexts: Applied at both the pod and container levels for fine-grained access control.
This robust architecture provides a secure, scalable, and maintainable foundation for NLWeb, enabling it to thrive in demanding production environments.
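Expressed as Helm values, these hardening measures might look like the sketch below — field names follow common chart conventions and may differ slightly in the actual NLWeb chart:

```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 999            # non-root UID noted above
  fsGroup: 999
securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
```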
Helm Chart Structure and Values
The NLWeb Helm chart is designed for extensive customization, allowing developers to tailor deployments to specific needs through its values.yaml configuration. A basic example showcasing some core values might look like this:
replicaCount: 1

image:
  repository: iunera/nlweb
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 8000

env:
  - name: PYTHONPATH
    value: "/app"
  - name: PORT
    value: "8000"
  - name: NLWEB_LOGGING_PROFILE
    value: production
Beyond these basic settings, the chart supports a rich array of advanced features, essential for production deployments:
- Autoscaling: Configuration for Horizontal Pod Autoscaler (HPA) with CPU-based scaling to dynamically adjust replica counts.
- Ingress: Integration with NGINX ingress controllers, including SSL/TLS termination for secure external access.
- Volumes: Support for Persistent Volume Claims (PVCs), ConfigMaps, and EmptyDir volumes to manage data and configuration.
- ConfigMaps: Detailed mechanisms to configure NLWeb's various settings (LLM, vector endpoints, etc.) directly from Kubernetes ConfigMaps.
- Security: Further enforcement of pod security contexts and network policies to isolate and protect the application.
FluxCD in Action: Automating NLWeb Deployments
FluxCD is more than just a deployment tool; it's a continuous delivery solution for Kubernetes that empowers GitOps. It acts as the bridge between your Git repository and your Kubernetes cluster, ensuring that any changes committed to your manifest files are automatically, consistently, and reliably applied.
The HelmRelease Controller
Central to FluxCD's GitOps approach for NLWeb is the HelmRelease custom resource. This powerful Custom Resource Definition (CRD) manages the entire lifecycle of a Helm chart deployment. Here's a typical HelmRelease configuration for NLWeb:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
This manifest instructs FluxCD to continuously monitor the iunera-helm-charts repository. The interval: 1m0s setting ensures FluxCD checks for updates every minute, providing near real-time deployment capabilities. When a new chart version or configuration change is detected in the Git repository, FluxCD automatically performs an upgrade on the NLWeb release in the cluster.
Image Automation and Version Management
Keeping container images up-to-date manually is tedious and error-prone. FluxCD's image automation capabilities elegantly solve this problem for NLWeb deployments. The system can automatically detect new container image versions published to a registry and update the corresponding deployment manifests in Git. This is invaluable for maintaining up-to-date deployments while integrating with proper testing and validation workflows.
To enable this, NLWeb deployments leverage special annotations within the HelmRelease manifest, as shown below:
image:
  repository: iunera/nlweb # {"$imagepolicy": "flux-system:nlweb:name"}
  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}
These annotations serve as directives for FluxCD, telling it to automatically update the image repository and tag values based on an ImagePolicy defined elsewhere. When a new image version is detected that matches the policy criteria, FluxCD not only updates the manifest but also commits these changes back to the Git repository, maintaining the single source of truth.
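To build intuition for what the Setters strategy does with these markers, here is a simplified Python sketch that rewrites the value on any line carrying an $imagepolicy comment. This is illustrative only — FluxCD's real implementation uses kyaml setters, not regular expressions, and the resolved-value map here is invented example data:

```python
import re

# Example data: "namespace:policy:field" markers mapped to the latest
# values the image-reflector controller has resolved for them.
RESOLVED = {
    "flux-system:nlweb:tag": "1.2.5",
}

# Matches:  <indent><key>: <value> # {"$imagepolicy": "<marker>"}
MARKER = re.compile(r'^(\s*\w+:\s*)(\S+)(\s*#\s*\{"\$imagepolicy":\s*"([^"]+)"\})')

def apply_setters(manifest: str) -> str:
    """Replace marked values with their resolved counterparts."""
    out = []
    for line in manifest.splitlines():
        m = MARKER.match(line)
        if m and m.group(4) in RESOLVED:
            line = f'{m.group(1)}{RESOLVED[m.group(4)]}{m.group(3)}'
        out.append(line)
    return "\n".join(out)

manifest = 'image:\n  tag: 1.2.4 # {"$imagepolicy": "flux-system:nlweb:tag"}'
# The tag line is rewritten to 1.2.5; unmarked lines pass through untouched.
print(apply_setters(manifest))
```

Because the marker comment survives the rewrite, the next automation run can update the same line again — the comment is the durable link between the manifest field and its ImagePolicy.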
Image Repository and Policy Configuration
The image automation magic is configured through two key resources, often defined in a file like nlweb.imagerepo.yaml:
# ImageRepository defines the Docker image repository to monitor
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: nlweb
  namespace: flux-system
spec:
  image: iunera/nlweb
  interval: 10m
  secretRef:
    name: iunera
---
# ImagePolicy defines which image versions to select
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: nlweb
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: nlweb
  policy:
    semver:
      range: ">=1.0.0"
The ImageRepository resource specifies the Docker image to monitor (iunera/nlweb), how frequently to check for new versions (interval: 10m), and authentication credentials via secretRef for private registries. The ImagePolicy resource, on the other hand, defines the selection criteria for image versions using semantic versioning, ensuring only compatible updates (e.g., >=1.0.0) are applied.
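The semver selection can be illustrated with a small Python sketch — a deliberate simplification, since the real controller uses a full semver library with prerelease and build-metadata handling:

```python
def parse(tag):
    """Parse a 'MAJOR.MINOR.PATCH' tag into a comparable tuple of ints."""
    return tuple(int(part) for part in tag.split("."))

def select_latest(tags, minimum="1.0.0"):
    """Pick the highest tag satisfying a >=minimum semver range."""
    floor = parse(minimum)
    candidates = [t for t in tags if parse(t) >= floor]
    return max(candidates, key=parse)

tags = ["0.9.0", "1.0.0", "1.2.4", "1.2.10", "1.10.0"]
print(select_latest(tags))  # 1.10.0 -- numeric comparison, so it beats 1.2.10
```

Note that the comparison is numeric per component, not lexicographic — that is exactly why semver policies are safer than sorting tags as strings, where "1.10.0" would incorrectly sort below "1.2.4".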
Automation Workflow
The entire image update automation workflow is orchestrated by the ImageUpdateAutomation resource:
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: master
    commit:
      author:
        email: fluxcdbot@nodomain.local
        name: fluxcdbot
      messageTemplate: |
        Automated image update

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Changed.FileChanges -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $changes := .Changed.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
          Changes:
        {{- range $_, $change := $changes }}
          - {{ $change.OldValue }} -> {{ $change.NewValue }}
        {{ end -}}
        {{ end -}}
    push:
      branch: master
  interval: 30m0s
  sourceRef:
    kind: GitRepository
    name: flux-system
  update:
    path: ./kubernetes/common
    strategy: Setters
This resource ensures that FluxCD:
- Checks out the master branch of your Git repository.
- Commits changes with a descriptive message template that details what was updated.
- Pushes these changes back to the master branch.
- Executes this process every 30 minutes.
- Applies updates within the ./kubernetes/common path, using the "Setters" strategy to look for image policy annotations.
Through this sophisticated configuration, your NLWeb deployment automatically stays current with the latest compatible container images, all without manual intervention. This entire process is meticulously tracked in your Git history, providing a complete audit trail and easy rollback capabilities. It's a testament to the power of declarative, automated operations.
Building & Deploying: The CI/CD Pipeline
The journey of NLWeb from source code to a running application in Kubernetes is facilitated by a robust CI/CD pipeline, integrating Docker builds with the GitOps workflow powered by FluxCD.
Dockerfile Structure and Multi-Stage Build
An efficient and secure container image is foundational to any cloud-native application. NLWeb leverages a Docker multi-stage build process to achieve this, separating build dependencies from the final runtime environment. This results in smaller, more secure images.
# Stage 1: Build stage
FROM python:3.13-slim AS builder

# Install build dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc python3-dev && \
    pip install --no-cache-dir --upgrade pip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy requirements file
COPY code/requirements.txt .

# Install Python packages
RUN pip install --no-cache-dir -r requirements.txt

# Copy Docker-specific requirements file
COPY docker_requirements.txt .

# Install Docker-specific Python packages
RUN pip install --no-cache-dir -r docker_requirements.txt

# Stage 2: Runtime stage
FROM python:3.13-slim

# Apply security updates
RUN apt-get update && \
    apt-get install -y --no-install-recommends --only-upgrade \
        $(apt-get --just-print upgrade | grep "^Inst" | grep -i securi | awk '{print $2}') && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Create a non-root user and set permissions
RUN groupadd -r nlweb && \
    useradd -r -g nlweb -d /app -s /bin/bash nlweb && \
    chown -R nlweb:nlweb /app

USER nlweb

# Copy application code
COPY code/ /app/
COPY static/ /app/static/

# Copy installed packages from builder stage
COPY --from=builder /usr/local/lib/python3.13/site-packages /usr/local/lib/python3.13/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

# Expose the port the app runs on
EXPOSE 8000

# Set environment variables
ENV NLWEB_OUTPUT_DIR=/app
ENV PYTHONPATH=/app
ENV PORT=8000
ENV VERSION=1.2.4

# Command to run the application
CMD ["python", "app-file.py"]
Key aspects of this Dockerfile include:
- Stage 1 (Builder): Installs all necessary build-time dependencies, ensuring the final image is lean.
- Stage 2 (Runtime): Creates a minimal, secure runtime environment, applying security updates and removing build tools.
- Security Features: Enforces non-root user execution (USER nlweb), a best practice for container security, and ensures minimal dependencies are present.
- Version Definition: The ENV VERSION variable is critical for consistent image tagging.
GitHub Actions Workflow
When changes are pushed to the iuneracustomizations branch and the Dockerfile is modified, a GitHub Actions CI/CD automation workflow, typically defined in .github/workflows/prod-build.yml, is triggered. This workflow orchestrates the build and push process for the NLWeb Docker image.
name: prod-build

on:
  push:
    branches:
      - iuneracustomizations
    paths:
      - Dockerfile

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Private Registry
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Extract Version from Dockerfile
        id: extract_version
        run: |
          # Extract the VERSION from Dockerfile
          VERSION=$(grep "ENV VERSION=" Dockerfile | cut -d= -f2)
          echo "VERSION=${VERSION}" >> $GITHUB_ENV
          echo "Using version from Dockerfile: ${VERSION}"

      - name: Build the Docker image
        run: |
          docker build -t iunera/nlweb:latest -t iunera/nlweb:${{ env.VERSION }} .
          docker push iunera/nlweb:latest
          docker push iunera/nlweb:${{ env.VERSION }}
          echo "Built and pushed Docker image with tags: latest, ${{ env.VERSION }}"

      - name: Inspect
        run: |
          docker image inspect iunera/nlweb:latest

      - name: Create and Push Git Tag
        run: |
          git config --global user.name "GitHub Actions"
          git config --global user.email "actions@github.com"
          git tag -a v${{ env.VERSION }} -m "Release version ${{ env.VERSION }}"
          git push origin v${{ env.VERSION }}
This workflow handles crucial steps:
- Checkout Repository: Clones the codebase.
- Set up Docker Buildx & QEMU: Configures Docker for multi-architecture builds (ARM64, AMD64), essential for modern cloud environments.
- Log in to Private Registry: Authenticates with Docker Hub (or any other registry) using GitHub Secrets.
- Extract Version: Dynamically parses the Dockerfile to get the VERSION environment variable, ensuring consistency.
- Build and Push: Builds the Docker image and tags it with both latest and the extracted version number, pushing both to Docker Hub.
- Create Git Tag: Creates and pushes a Git tag, linking the code version to the deployed image.
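The version-extraction convention is easy to mirror (and unit-test) outside CI; an illustrative Python equivalent of the grep/cut pipeline, not part of the actual workflow:

```python
import re

def extract_version(dockerfile_text):
    """Return the value of the first `ENV VERSION=...` line, mirroring the
    `grep "ENV VERSION=" | cut -d= -f2` step in the workflow."""
    m = re.search(r'^ENV VERSION=(\S+)', dockerfile_text, flags=re.MULTILINE)
    if m is None:
        raise ValueError("no ENV VERSION= line found in Dockerfile")
    return m.group(1)

sample = "FROM python:3.13-slim\nENV VERSION=1.2.4\nEXPOSE 8000\n"
print(extract_version(sample))  # 1.2.4
```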
Complete CI/CD to Deployment Flow
This intricate pipeline connects development to deployment seamlessly:
- Development: A developer modifies the Dockerfile (perhaps updating the VERSION) or application code.
- CI/CD: GitHub Actions automatically builds and pushes the new Docker image to Docker Hub.
- Automation: FluxCD, continuously monitoring Docker Hub, detects the new image version.
- GitOps: FluxCD updates the Kubernetes manifests in the Git repository with the new image version and commits these changes.
- Deployment: FluxCD, observing the Git repository, applies these changes to the Kubernetes cluster, deploying new pods with the updated NLWeb image.
This GitOps approach ensures your Git repository is always the single source of truth, all changes are auditable, deployments are automated and consistent, and rollbacks are straightforward and reliable.
Local Development Environment
While GitHub Actions handles production builds, local development for NLWeb is streamlined using Docker Compose, providing a consistent environment:
services:
  nlweb:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: nlweb
    ports:
      - "8000:8000"
    env_file:
      - ./code/.env
    environment:
      - PYTHONPATH=/app
      - PORT=8000
    volumes:
      - ./data:/data
      - ./code/config:/app/config:ro
    healthcheck:
      test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000')\""]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    restart: unless-stopped
    user: nlweb
This Docker Compose setup mirrors production by using the same Dockerfile, mounting local directories for data and configuration, loading environment variables, and including health checks, all while running as the secure, non-root nlweb user.
Cloud-Native with Azure: Powering AI Infrastructure
For organizations deeply embedded in Microsoft's cloud ecosystem, NLWeb offers exceptional synergy with Azure services. This native integration makes it an ideal choice for building scalable, enterprise-grade AI solutions.
NLWeb natively supports:
- Azure Cognitive Search: For robust vector search capabilities, NLWeb integrates with Azure's vector search service, enabling scalable and performant similarity searches across massive datasets. This is a critical component for Retrieval Augmented Generation (RAG) patterns in AI applications. For more on optimizing RAG, consider exploring Enterprise AI Excellence: How to do an Agentic Enterprise RAG and Polyglot Knowledge RAG Ingestion Concept for Enterprise Ready AIs.
- Azure OpenAI Service: Direct integration with Azure's enterprise-grade OpenAI offerings, including models like GPT-4 and its embedding models. This ensures high-performance AI capabilities with the added benefits of Azure's security, compliance, and governance features.
- Azure Container Registry (ACR): Seamless integration with ACR for secure container image management, storage, and vulnerability scanning, completing the cloud-native CI/CD loop.
Configuration for these Azure services is elegantly handled through Kubernetes environment variables and ConfigMaps, allowing for easy management across different environments while adhering to security best practices:
env:
  - name: AZURE_VECTOR_SEARCH_ENDPOINT
    value: "https://your-vector-search-db.search.windows.net"
  - name: AZURE_OPENAI_ENDPOINT
    value: "https://your-openai-instance.openai.azure.com/"
Achieving Production-Readiness with NLWeb
Beyond basic deployment, NLWeb is engineered with several production-ready features that make it suitable for demanding enterprise workloads.
Multi-Provider LLM Support
As highlighted earlier, NLWeb's multi-provider LLM support is a cornerstone of its design. It supports a diverse range of models from:
- OpenAI: GPT-4.1, GPT-4.1-mini, GPT-4-turbo, GPT-3.5-turbo.
- Anthropic: Claude-3-7-sonnet-latest, Claude-3-5-haiku-latest, Claude-3-opus-20240229.
- Azure OpenAI: Enterprise-grade models with Azure's security.
- Google Gemini: chat-bison models, Gemini 1.5 Pro, Gemini 1.5 Flash.
- Snowflake: Arctic embedding models and Claude integration.
- Hugging Face: Various open-source models, including the Qwen2.5 series and sentence-transformers/all-mpnet-base-v2.
This breadth of support allows organizations to strategically optimize costs by using different models for varying use cases, enhance service availability through provider redundancy, and experiment with cutting-edge models without being tethered to a single vendor. It also provides flexibility to integrate with custom models, potentially from an Enterprise MCP Server Development project.
Performance Optimization and Caching
To ensure NLWeb performs optimally and minimizes API costs (especially critical with LLM interactions), sophisticated caching mechanisms are implemented:
cache:
  enable: true
  max_size: 1000
  ttl: 0  # No expiration
  include_schema: true
  include_provider: true
  include_model: true
This caching system is intelligent, considering factors like schema, provider, and model when generating cache keys. This ensures high cache hit rates while maintaining the accuracy and relevance of AI responses. Further tuning options live in the config_llm_performance.yaml file introduced earlier.
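A hedged sketch of how such a composite cache key could be derived — illustrative only, as NLWeb's actual key derivation may differ; the function name and parameters here are invented for the example:

```python
import hashlib
import json

def cache_key(prompt, provider, model, schema=None,
              include_schema=True, include_provider=True, include_model=True):
    """Build a deterministic cache key from the prompt plus whichever
    context fields the cache configuration says to include."""
    parts = {"prompt": prompt}
    if include_provider:
        parts["provider"] = provider
    if include_model:
        parts["model"] = model
    if include_schema and schema is not None:
        parts["schema"] = schema
    # Canonical JSON (sorted keys) so equal inputs always hash identically.
    canonical = json.dumps(parts, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

k1 = cache_key("What is NLWeb?", "azure_openai", "gpt-4o")
k2 = cache_key("What is NLWeb?", "openai", "gpt-4o")
print(k1 != k2)  # True: different providers yield different cache entries
```

Including provider and model in the key prevents a response generated by one model from being served as a cache hit for another, at the cost of a lower hit rate — which is why these dimensions are toggles rather than hard-coded.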
Enterprise Data Integration
NLWeb's true power shines through its ability to seamlessly integrate with diverse enterprise data sources. Building on foundational concepts for data exposure (as discussed in Guide: Exposing Enterprise Data with Java and Spring for AI Indexing (for NLWeb)), NLWeb supports:
- JSON-LD and Schema.org: Facilitates structured data integration for enhanced semantic web capabilities, making data more understandable by AI. More on this can be found in How Markdown, JSON-LD and Schema.org Improve Vectorsearch RAGs and NLWeb.
- Vector Database Integration: Compatible with various vector databases, including Azure Cognitive Search, for efficient similarity search.
- Real-time Data Processing: Capabilities for stream processing to handle dynamic content updates.
- Enterprise Security: Features like role-based access control (RBAC) and data governance to protect sensitive information.
NLWeb GitOps vs. Traditional Approaches: A Comparison
To truly appreciate the value of NLWeb deployed with GitOps and FluxCD, it's helpful to contrast it with more traditional deployment methods:
- Scalability: GitOps-managed NLWeb on Kubernetes offers auto-scaling via Horizontal Pod Autoscalers (HPAs), dynamically adjusting to demand. Traditional methods often involve limited vertical scaling or cumbersome manual scaling.
- Deployment Speed: Automated deployments via GitOps mean changes are rapidly applied from commit to production. Manual or portal-based deployments are inherently slower and more error-prone.
- Configuration Management: Git-based versioning provides a complete history and single source of truth for NLWeb configurations. Traditional approaches often rely on portal-based settings or disparate file-based configurations across servers.
- Multi-environment Support: Kubernetes namespaces and Helm charts provide native support for consistent multi-environment deployments. Traditional methods require separate application instances or even entire server fleets.
- Rollback Capabilities: Git-based rollbacks are fast, reliable, and well-audited with GitOps. Other methods offer limited or manual rollback processes.
- Cost Optimization: Kubernetes allows for resource-based pricing and efficient utilization. Traditional infrastructure often incurs higher fixed infrastructure costs or less granular cloud billing.
- Monitoring & Observability: NLWeb on Kubernetes leverages robust Kubernetes-native tools and integrations with platforms like Azure Monitor. Traditional setups require custom monitoring configurations.
- Security: Kubernetes' pod security contexts combined with network policies provide strong security. Traditional Linux installs require extensive manual security hardening.
The iunera.com Helm charts specifically designed for NLWeb provide a significant advantage in this comparison, offering production-tested configurations that mitigate common deployment pitfalls and accelerate time to value.
Advanced Configuration: Real-World Examples
This section delves into practical, production-ready configuration examples, providing templates for your NLWeb deployments.
Complete Helm Installation Manifest Examples
Basic Development Setup
A minimal HelmRelease for a development environment:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-dev
  namespace: nlweb-dev
spec:
  releaseName: nlweb-dev
  targetNamespace: nlweb-dev
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 5m0s
  install:
    createNamespace: true
  values:
    replicaCount: 1
    image:
      repository: iunera/nlweb
      tag: "latest"
      pullPolicy: Always
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: development
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-secrets
            key: openai-api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
      hosts:
        - host: nlweb-dev.local
          paths:
            - path: /
              pathType: ImplementationSpecific
    resources:
      requests:
        cpu: 100m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 1Gi
This manifest sets up a single replica, pulls the latest image, enables a basic NGINX ingress, and configures development-specific logging and resource requests.
Production-Ready Setup with Multi-Provider LLM Support
For a robust production environment, integrating multiple AI providers and advanced security:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nlweb-prod
  namespace: nlweb
spec:
  releaseName: nlweb
  targetNamespace: nlweb
  chart:
    spec:
      chart: nlweb
      version: ">=1.1.0"
      sourceRef:
        kind: HelmRepository
        name: iunera-helm-charts
        namespace: helmrepos
  interval: 1m0s
  install:
    createNamespace: false
  upgrade:
    remediation:
      retries: 3
  values:
    replicaCount: 3
    image:
      repository: iunera/nlweb
      tag: "1.2.4"
      pullPolicy: IfNotPresent
    env:
      - name: NLWEB_LOGGING_PROFILE
        value: production
      - name: AZURE_VECTOR_SEARCH_ENDPOINT
        value: "https://nlweb-prod.search.windows.net"
      - name: AZURE_VECTOR_SEARCH_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-secrets
            key: vector-search-key
      - name: OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-openai-secrets
            key: api-key
      - name: ANTHROPIC_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-anthropic-secrets
            key: api-key
      - name: AZURE_OPENAI_API_KEY
        valueFrom:
          secretKeyRef:
            name: nlweb-azure-openai-secrets
            key: api-key
    ingress:
      enabled: true
      annotations:
        kubernetes.io/ingress.class: nginx
        kubernetes.io/tls-acme: "true"
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
        nginx.ingress.kubernetes.io/enable-modsecurity: "true"
        nginx.ingress.kubernetes.io/enable-owasp-core-rules: "true"
        nginx.ingress.kubernetes.io/rate-limit: "100"
        nginx.ingress.kubernetes.io/rate-limit-window: "1m"
      hosts:
        - host: nlweb.example.com
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
        - secretName: nlweb-tls
          hosts:
            - nlweb.example.com
    resources:
      requests:
        cpu: 200m
        memory: 1Gi
      limits:
        cpu: 1000m
        memory: 2Gi
    autoscaling:
      enabled: true
      minReplicas: 3
      maxReplicas: 10
      targetCPUUtilizationPercentage: 70
      targetMemoryUtilizationPercentage: 80
This production example demonstrates increased replicaCount, specific image tags, environment variables sourced from Kubernetes Secrets for API keys, and comprehensive ingress configuration including TLS, Cert-Manager, ModSecurity, OWASP rules, and rate limiting. It also enables Horizontal Pod Autoscaling based on CPU and memory usage.
Comprehensive ConfigMap Customization Examples
NLWeb's flexibility truly shines when using ConfigMaps to manage its detailed configurations.
Web Server Configuration for Different Environments
Development Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-dev-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: development
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins: "*"  # Allow all origins in dev
            max_connections: 50
            timeout: 60
          logging:
            level: debug
            file: ./logs/webserver.log
            console: true
          static:
            enable_cache: false  # Disable caching in dev
            gzip_enabled: false
Production Environment ConfigMap:
volumes:
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
      data:
        config_webserver.yaml: |-
          port: 8000
          static_directory: ../../
          mode: production
          server:
            host: 0.0.0.0
            enable_cors: true
            cors_trusted_origins:
              - https://nlweb.example.com
              - https://api.example.com
              - https://admin.example.com
            max_connections: 200
            timeout: 30
            ssl:
              enabled: true
              cert_file_env: SSL_CERT_FILE
              key_file_env: SSL_KEY_FILE
          logging:
            level: info
            file: ./logs/webserver.log
            console: false
            rotation:
              max_size: 100MB
              max_files: 10
          static:
            enable_cache: true
            cache_max_age: 86400  # 24 hours
            gzip_enabled: true
            compression_level: 6
These examples showcase how ConfigMaps allow for granular control over NLWeb's web server behavior, adjusting settings like CORS policies, logging levels, SSL, and caching for different environments.
Multi-Provider LLM Configuration
An enterprise LLM setup with fallback providers, ensuring resilience and cost optimization:
volumes:
  configMaps:
    - name: nlweb-llm-config
      mountPath: /app/config
      data:
        config_llm.yaml: |-
          preferred_endpoint: azure_openai
          fallback_strategy: round_robin
          endpoints:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-12-01-preview"
              llm_type: azure_openai
              models:
                high: gpt-4o
                low: gpt-4o-mini
              rate_limits:
                requests_per_minute: 1000
                tokens_per_minute: 150000
              retry_config:
                max_retries: 3
                backoff_factor: 2
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              llm_type: openai
              models:
                high: gpt-4-turbo
                low: gpt-3.5-turbo
              rate_limits:
                requests_per_minute: 500
                tokens_per_minute: 90000
            anthropic:
              api_key_env: ANTHROPIC_API_KEY
              llm_type: anthropic
              models:
                high: claude-3-opus-20240229
                low: claude-3-haiku-20240307
              rate_limits:
                requests_per_minute: 300
                tokens_per_minute: 60000
            gemini:
              api_key_env: GCP_PROJECT
              llm_type: gemini
              models:
                high: gemini-1.5-pro
                low: gemini-1.5-flash
              rate_limits:
                requests_per_minute: 200
                tokens_per_minute: 40000
This robust ConfigMap illustrates how to define multiple LLM endpoints, their models, API keys (referenced via environment variables), rate limits, and fallback strategies. This is crucial for building resilient AI applications.
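The fallback behavior that this configuration enables can be sketched in a few lines of Python. The `dispatch` function and the fake client below are illustrative stand-ins under the assumption that a preferred endpoint is tried first (with retries and exponential backoff, per `retry_config`) before falling back to the remaining endpoints in order; NLWeb's real client logic may differ.

```python
# Illustrative sketch of preferred-endpoint dispatch with fallback, mirroring
# the preferred_endpoint / retry_config fields in the ConfigMap above.
ENDPOINTS = ["azure_openai", "openai", "anthropic", "gemini"]

def dispatch(prompt, call_endpoint, preferred="azure_openai",
             max_retries=3, backoff_factor=2):
    """Try the preferred endpoint with retries, then fall back in order."""
    order = [preferred] + [e for e in ENDPOINTS if e != preferred]
    for name in order:
        delay = 1
        for _attempt in range(max_retries):
            try:
                return name, call_endpoint(name, prompt)
            except Exception:
                # A real client would time.sleep(delay) here before retrying.
                delay *= backoff_factor
        # All retries for this endpoint failed; fall back to the next one.
    raise RuntimeError("all LLM endpoints failed")

# Simulate an Azure outage: the first provider raises, the next succeeds.
def flaky_client(name, prompt):
    if name == "azure_openai":
        raise ConnectionError("simulated outage")
    return f"{name} answered: {prompt!r}"

endpoint, answer = dispatch("What is NLWeb?", flaky_client)
print(endpoint)  # openai
```

Because the API keys are referenced by environment-variable name (`api_key_env`), the same ConfigMap can be promoted across environments while the actual secrets stay in Kubernetes Secrets.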
Embedding Provider Configuration for Vector Search
Configuring multiple embedding providers, essential for robust RAG architectures:
```yaml
volumes:
  configMaps:
    - name: nlweb-embedding-config
      mountPath: /app/config
      data:
        config_embedding.yaml: |-
          preferred_provider: azure_openai
          fallback_providers:
            - openai
            - snowflake
          providers:
            azure_openai:
              api_key_env: AZURE_OPENAI_API_KEY
              api_endpoint_env: AZURE_OPENAI_ENDPOINT
              api_version_env: "2024-10-21"
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 1000
            openai:
              api_key_env: OPENAI_API_KEY
              api_endpoint_env: OPENAI_ENDPOINT
              model: text-embedding-3-large
              dimensions: 3072
              batch_size: 100
              rate_limits:
                requests_per_minute: 500
            snowflake:
              api_key_env: SNOWFLAKE_PAT
              api_endpoint_env: SNOWFLAKE_ACCOUNT_URL
              api_version_env: "2024-10-01"
              model: snowflake-arctic-embed-l
              dimensions: 1024
              batch_size: 50
              rate_limits:
                requests_per_minute: 200
            huggingface:
              api_key_env: HF_TOKEN
              model: sentence-transformers/all-mpnet-base-v2
              dimensions: 768
              local_inference: true
              device: cpu
```
This ConfigMap demonstrates how NLWeb can be configured to use a primary embedding provider (e.g., Azure OpenAI) with defined fallbacks (OpenAI, Snowflake, Hugging Face), specifying models, dimensions, batch sizes, and rate limits for each. This ensures reliable and performant vector generation for search and RAG.
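The `batch_size` and `dimensions` values above drive how documents are chunked into API calls. The Python sketch below is illustrative (`embed_all` and `fake_embed` are hypothetical stand-ins, not NLWeb functions); one practical caveat it encodes: fallback providers with different output dimensions (3072 vs. 1024 above) cannot share a single vector index, so falling back may require a separately maintained index.

```python
# Illustrative sketch of provider-aware embedding batching, using the
# batch_size and dimensions values from the ConfigMap above.
PROVIDERS = {
    "azure_openai": {"dimensions": 3072, "batch_size": 100},
    "openai":       {"dimensions": 3072, "batch_size": 100},
    "snowflake":    {"dimensions": 1024, "batch_size": 50},
}

def embed_all(texts, provider, embed_batch):
    """Embed texts in provider-sized batches; embed_batch stands in for the API call."""
    cfg = PROVIDERS[provider]
    vectors = []
    for i in range(0, len(texts), cfg["batch_size"]):
        batch = texts[i:i + cfg["batch_size"]]
        vectors.extend(embed_batch(batch, cfg["dimensions"]))
    return vectors

# Fake embedding call: returns one zero-vector of the right size per text.
def fake_embed(batch, dims):
    return [[0.0] * dims for _ in batch]

texts = [f"doc {i}" for i in range(120)]
vectors = embed_all(texts, "snowflake", fake_embed)
print(len(vectors), len(vectors[0]))  # 120 1024
```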
Performance Optimization Configuration
Optimizing for high performance and efficient resource usage:
```yaml
volumes:
  configMaps:
    - name: nlweb-performance-config
      mountPath: /app/config
      data:
        config_llm_performance.yaml: |-
          # LLM performance settings
          representation:
            use_compact: true
            limit: 10
            include_metadata: true
          cache:
            enable: true
            max_size: 10000
            ttl: 3600  # 1 hour
            include_schema: true
            include_provider: true
            include_model: true
            include_user_context: false
            compression: gzip
          rate_limiting:
            enable: true
            requests_per_minute: 1000
            burst_size: 100
            per_user_limit: 50
          monitoring:
            enable_metrics: true
            metrics_port: 9090
            health_check_interval: 30
            performance_logging: true
```
This ConfigMap shows how to fine-tune caching parameters (size, TTL, key components, compression) and implement rate limiting, critical for managing LLM API costs and ensuring service stability. It also includes monitoring settings, enabling robust observability.
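The caching policy described by the `cache` block (bounded size, TTL expiry, and a key composed of the enabled `include_*` components) can be sketched as follows. This `ResponseCache` class is an illustrative model of the policy, not NLWeb's implementation.

```python
import time
from collections import OrderedDict

class ResponseCache:
    """Sketch of a size-bounded TTL cache matching the config block above."""

    def __init__(self, max_size=10000, ttl=3600):
        self.max_size, self.ttl = max_size, ttl
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def _key(self, prompt, provider, model, schema=None):
        # include_provider / include_model / include_schema are true in the
        # config, so all three become part of the cache key; user context
        # (include_user_context: false) is deliberately left out.
        return (schema, provider, model, prompt)

    def get(self, prompt, provider, model, schema=None):
        key = self._key(prompt, provider, model, schema)
        entry = self._entries.get(key)
        if entry is None or time.monotonic() > entry[0]:
            self._entries.pop(key, None)  # expired or missing
            return None
        return entry[1]

    def put(self, value, prompt, provider, model, schema=None):
        key = self._key(prompt, provider, model, schema)
        self._entries[key] = (time.monotonic() + self.ttl, value)
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the oldest entry

cache = ResponseCache(max_size=2, ttl=3600)
cache.put("answer-1", "q1", "azure_openai", "gpt-4o")
print(cache.get("q1", "azure_openai", "gpt-4o"))  # answer-1
print(cache.get("q1", "openai", "gpt-4-turbo"))   # None (different key)
```

Excluding user context from the key (as the config does) means identical prompts are shared across users, maximizing hit rate at the cost of per-user personalization in cached responses.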
Environment-Specific Volume Configurations
Managing persistent storage and configuration mounting for different environments:
Development with Hot Reloading:
```yaml
volumes:
  enabled: true
  emptyDirs:
    - name: data
      mountPath: /app/data
    - name: logs
      mountPath: /app/logs
    - name: tmp
      mountPath: /tmp
    - name: cache
      mountPath: /app/cache
  # Development: use hostPath for easy file access
  hostPaths:
    - name: dev-config
      hostPath: /local/dev/nlweb/config
      mountPath: /app/config
      type: DirectoryOrCreate
```
Production with Persistent Storage:
```yaml
volumes:
  enabled: true
  emptyDirs:
    - name: tmp
      mountPath: /tmp
      sizeLimit: 1Gi
  pvc:
    enabled: true
    storageClass: fast-ssd
    size: 50Gi
    accessMode: ReadWriteOnce
    mountPath: /app/data
  # Production: use ConfigMaps for configuration
  configMaps:
    - name: nlweb-prod-config
      mountPath: /app/config
    - name: nlweb-llm-config
      mountPath: /app/config/llm
    - name: nlweb-embedding-config
      mountPath: /app/config/embedding
  # Production: use Secrets for sensitive data
  existingSecrets:
    - name: nlweb-api-keys
      mountPath: /app/secrets
      defaultMode: 0400
```
These examples illustrate how to dynamically manage volumes, using hostPath for local development convenience and Persistent Volume Claims (PVCs) for robust, scalable storage in production, alongside mounting ConfigMaps and Secrets for secure configuration management.
Getting Started: Step-by-Step Helm Installation Guide
Ready to get your hands dirty? Here's how to begin deploying NLWeb in your Kubernetes cluster.
Prerequisites Setup
Before deploying NLWeb, ensure you have the following prerequisites in place:
1. Add the Iunera Helm repository:

```shell
helm repo add iunera https://iunera.github.io/helm-charts/
helm repo update
```

2. Create the namespace and secrets:

```shell
# Create the namespace
kubectl create namespace nlweb

# Create secrets for API keys (replace with your actual keys)
kubectl create secret generic nlweb-openai-secrets \
  --from-literal=api-key="your-openai-api-key" \
  -n nlweb

kubectl create secret generic nlweb-azure-secrets \
  --from-literal=vector-search-key="your-azure-search-key" \
  --from-literal=openai-api-key="your-azure-openai-key" \
  -n nlweb
```

3. Install with custom values:

After preparing your `nlweb-values.yaml` file (using the advanced examples above as a starting point, tailored to your needs), you would apply it via FluxCD by committing it to your Git repository. For a direct Helm install (e.g., in a dev environment), you'd run:

```shell
helm install nlweb iunera/nlweb -n nlweb -f nlweb-values.yaml
```
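For the GitOps path, the same release can be expressed declaratively as a FluxCD `HelmRelease` that pulls its values from a ConfigMap committed to your Git repository. The resource names, namespaces, and interval below are illustrative assumptions, not prescribed by the chart:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2   # v2beta1/v2beta2 on older Flux versions
kind: HelmRelease
metadata:
  name: nlweb
  namespace: nlweb
spec:
  interval: 10m
  chart:
    spec:
      chart: nlweb
      sourceRef:
        kind: HelmRepository
        name: iunera          # assumes a HelmRepository object for the repo added above
        namespace: flux-system
  valuesFrom:
    - kind: ConfigMap
      name: nlweb-values      # holds the contents of your nlweb-values.yaml
```

With this in place, Flux reconciles the release whenever the chart version or the committed values change, which is exactly the automated, auditable loop the GitOps workflow promises.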
This foundational setup, combined with the GitOps workflow, will get your NLWeb instance operational and ready to serve intelligent web experiences.
Conclusion
The landscape of AI-powered web applications is rapidly transforming, and NLWeb stands at the forefront, offering a unique blend of intelligence, flexibility, and cloud-native design. By embracing GitOps methodologies with FluxCD and leveraging the robust capabilities of Kubernetes and Azure, developers can create a truly production-ready ecosystem for NLWeb. This approach ensures automated, consistent, and auditable deployments, allowing teams to focus on innovation rather than operational overhead.
Whether you're building a new AI-driven knowledge hub or transforming existing web experiences, understanding and implementing these modern DevOps practices for NLWeb will be crucial for success. The combination of open-source innovation, declarative operations, and powerful cloud infrastructure provides an unstoppable force for the future of intelligent web applications. For further exploration into related fields, consider iunera.com's expertise in Apache Druid AI Consulting Europe or their advancements in Enterprise MCP Server Development.
We encourage you to dive deeper, experiment with NLWeb, and contribute to its evolving journey. The future of intelligent web is here, and it's powered by code, AI, and GitOps.