Deploying Stirling PDF on EKS with Helm, SSO, and Persistent Storage

#stirlingpdf #kubernetes #helm #sso

What I Built

The system is a self-hosted, compliance-aligned PDF processing platform running on AWS EKS that replaces third-party SaaS alternatives. It meets requirements for data auditability and control by authenticating users through the existing corporate identity provider.

System Architecture

The setup relies on the following chart and infrastructure components:

stirling-pdf-chart — upstream application chart version 3.1.0 declared as a clean dependency alias

Persistent Volume Claims — three 1Gi gp3 volumes providing non-shared storage for application data paths

AWS Secrets Manager / SecretStore — infrastructure provider for decoupling and pulling runtime SSO credentials

AWS Load Balancer Controller — routing component handling multi-service integration on a single load balancer

Core Technical Behavior

At runtime, the wrapper chart dynamically provisions three distinct 1Gi gp3 Persistent Volume Claims mapped directly to internal paths: /configs, /pipeline, and /usr/share/tessdata. The /usr/share/tessdata mount retains OCR language assets across container lifecycles, so language packs are not downloaded again each time a pod restarts.

Pod initialization is slow because of security checks and OCR engine setup, so the readiness and liveness probes use conservative timings, detailed after the chart dependency and volume listings below.

Upstream chart dependency block

dependencies:
  - name: stirling-pdf-chart
    alias: stirling-pdf
    version: "3.1.0"
    repository: "https://stirling-tools.github.io/Stirling-PDF-chart"

Persistent volume loop layout

persistence:
  additionalVolumes:
    - name: configs
      size: 1Gi
    - name: pipeline
      size: 1Gi
    - name: tessdata
      size: 1Gi
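
The list above could drive a small template in the wrapper chart that emits one claim per entry. The following is a minimal sketch rather than the chart's actual template: the claim naming scheme and the gp3 StorageClass name are assumptions, while the ReadWriteOnce access mode follows the trade-offs discussed later.

```yaml
# templates/pvc.yaml (hypothetical file name in the wrapper chart)
{{- range .Values.persistence.additionalVolumes }}
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Assumed naming scheme: <release>-<volume name>
  name: {{ $.Release.Name }}-{{ .name }}
spec:
  accessModes:
    - ReadWriteOnce
  # Assumes a StorageClass named "gp3" exists in the cluster
  storageClassName: gp3
  resources:
    requests:
      storage: {{ .size }}
{{- end }}
```

Adding a fourth volume then only requires a new entry in the values list, not a new template.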

Liveness checks delay for 120 seconds and retry every 30 seconds, tolerating up to 5 consecutive failures before forcing a restart. Readiness checks delay for 90 seconds and run every 15 seconds, hitting the /api/v1/info/status path to track initialization progress.
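
Expressed as values, the probe timings above might look like the sketch below. The `stirling-pdf` key matches the dependency alias, but the `probes` layout, port 8080, and the liveness path are assumptions to check against the upstream chart; only the timing numbers and the readiness path come from the description above.

```yaml
stirling-pdf:
  probes:                 # assumed key layout, verify against the upstream chart
    liveness:
      httpGet:
        path: /api/v1/info/status   # assumed; the text names this path only for readiness
        port: 8080                  # assumed container port
      initialDelaySeconds: 120
      periodSeconds: 30
      failureThreshold: 5
    readiness:
      httpGet:
        path: /api/v1/info/status
        port: 8080                  # assumed container port
      initialDelaySeconds: 90
      periodSeconds: 15
```

With these numbers, a pod that never becomes healthy is restarted no earlier than about four minutes after start: the first liveness probe fires at 120 seconds and the fifth consecutive failure lands at 120 + 4 × 30 = 240 seconds, ignoring probe timeouts.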

Authentication is mandatory, and its credentials are supplied from an external secret rather than from chart values:

SSO environment declaration

envsFrom:
  - secretRef:
      name: stirling-pdf-sso-secret

The runtime enforces DOCKER_ENABLE_SECURITY=true, SECURITY_OAUTH2_ENABLED=true, and SECURITY_ENABLELOGIN=true while maintaining active CSRF protection. To prevent processing failures during token exchanges and large file moves, SERVER_TOMCAT_MAX_HTTP_HEADER_SIZE is expanded to 65536 bytes, and the permitted form post size is raised to 10MB.
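
As a rough illustration, the settings above could be passed as environment variables under the dependency alias. This is a sketch under assumptions: the `envs` key layout, the `SECURITY_CSRFDISABLED` name, and the `SERVER_TOMCAT_MAX_HTTP_FORM_POST_SIZE` name (derived from Spring Boot's `server.tomcat.max-http-form-post-size` property) are not stated in the original configuration and should be verified.

```yaml
stirling-pdf:
  envs:                   # assumed key layout, verify against the upstream chart
    - name: DOCKER_ENABLE_SECURITY
      value: "true"
    - name: SECURITY_ENABLELOGIN
      value: "true"
    - name: SECURITY_OAUTH2_ENABLED
      value: "true"
    - name: SECURITY_CSRFDISABLED              # assumed name; "false" keeps CSRF protection on
      value: "false"
    - name: SERVER_TOMCAT_MAX_HTTP_HEADER_SIZE
      value: "65536"
    - name: SERVER_TOMCAT_MAX_HTTP_FORM_POST_SIZE   # assumed name for the form post limit
      value: "10MB"
```

Every value is quoted because Kubernetes requires environment variable values to be strings; an unquoted `true` or `65536` would be parsed as a boolean or integer and rejected.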

Ingress traffic routing uses an internal scheme linked via the AWS Load Balancer Controller:

Ingress group settings

alb.ingress.kubernetes.io/load-balancer-name: "stage-shared-alb"
alb.ingress.kubernetes.io/group.name: "stage"

Path rules route incoming requests only to static resources, the core API, and login endpoints, so other internal application paths are not reachable through the load balancer.
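
A rendered Ingress following this pattern might resemble the sketch below. The two group annotations come from the settings above; the host, the three example paths, the service name and port, and the `target-type` annotation are hypothetical placeholders, since the actual path list is not given here.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: stirling-pdf                               # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/load-balancer-name: "stage-shared-alb"
    alb.ingress.kubernetes.io/group.name: "stage"
    alb.ingress.kubernetes.io/target-type: ip      # assumed
spec:
  ingressClassName: alb
  rules:
    - host: pdf.stage.example.internal             # hypothetical host
      http:
        paths:
          # Only explicitly listed prefixes are routed; everything else
          # falls through to the load balancer's default action.
          - path: /api                             # hypothetical path
            pathType: Prefix
            backend:
              service:
                name: stirling-pdf                 # hypothetical service
                port:
                  number: 8080                     # assumed port
          - path: /login                           # hypothetical path
            pathType: Prefix
            backend:
              service:
                name: stirling-pdf
                port:
                  number: 8080
          - path: /css                             # hypothetical path
            pathType: Prefix
            backend:
              service:
                name: stirling-pdf
                port:
                  number: 8080
```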

Key Engineering Decisions

Wrapping the upstream chart as a dependency keeps the application's release lifecycle separate from platform configuration. Upstream updates are consumed by bumping the dependency version string, which separates core application changes from internal platform assets such as network policies or volume claims.

Externalizing credentials via AWS Secrets Manager prevents leaking raw keys. An ExternalSecret syncs the values into a Kubernetes Secret that the pod loads as environment variables, so no plaintext credentials live in the repository.
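
An ExternalSecret for this setup could look roughly like the following. It is a sketch, not the deployed manifest: the SecretStore name, the Secrets Manager key, the refresh interval, and the resource name are hypothetical, and the `apiVersion` depends on the installed External Secrets Operator release. Only the target secret name, `stirling-pdf-sso-secret`, comes from the environment declaration shown earlier.

```yaml
apiVersion: external-secrets.io/v1beta1   # may be external-secrets.io/v1 on newer operator versions
kind: ExternalSecret
metadata:
  name: stirling-pdf-sso                  # hypothetical name
spec:
  refreshInterval: 1h                     # hypothetical interval
  secretStoreRef:
    kind: SecretStore
    name: aws-secrets-manager             # hypothetical SecretStore
  target:
    # Must match the secretRef name the pod loads its environment from
    name: stirling-pdf-sso-secret
  dataFrom:
    - extract:
        key: stage/stirling-pdf/sso       # hypothetical Secrets Manager entry
```

Using `dataFrom.extract` copies every key of the JSON secret into the Kubernetes Secret, so new SSO parameters can be added in Secrets Manager without touching the chart.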

Consolidating services into a shared ALB group limits platform cost overhead. Setting a matching ingress group name allows the controller to attach multi-service routing rules to the stage-shared-alb instead of spinning up standalone, single-tenant load balancers.

Layering environment settings reduces configuration redundancy. A global values.yaml defines the platform baselines, while each environment's values file overrides only the parameters that differ for that deployment.
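
The layering can be pictured with two small files, shown here as two YAML documents. The stage file name and the `ingress` keys are hypothetical; the volume entries and annotation values are taken from the snippets earlier in this post.

```yaml
# values.yaml: baseline shared by every environment
persistence:
  additionalVolumes:
    - name: configs
      size: 1Gi
    - name: pipeline
      size: 1Gi
    - name: tessdata
      size: 1Gi
ingress:
  enabled: false            # hypothetical key
---
# values-stage.yaml (hypothetical file name): only the stage-specific deltas
ingress:
  enabled: true
  annotations:
    alb.ingress.kubernetes.io/load-balancer-name: "stage-shared-alb"
    alb.ingress.kubernetes.io/group.name: "stage"
```

Passing both files to Helm with `-f values.yaml -f values-stage.yaml` applies them in order, with later files taking precedence. Maps are merged key by key, but lists are replaced wholesale, so an environment that overrides `additionalVolumes` has to restate the full list.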

Trade-offs

Optimized for: operational simplicity, compliance alignment, cost efficiency on shared infrastructure.

Sacrificed:

Pod startup speed: conservative initial delays and high failure thresholds lengthen application rollout times.
Multi-AZ storage resilience: gp3 volumes are EBS-backed and tied to a single availability zone, and ReadWriteOnce allows attachment to only one node at a time. A replacement pod can therefore be scheduled only in the original zone, and it may be unable to start on a healthy node until the failed node is confirmed dead and the volume is released.
Early network security validation: NetworkPolicy resources are disabled in the staging environment to reduce engineering friction during initial runtime testing.

Conclusion

This EKS architecture establishes a secure, self-hosted PDF utility built on an upstream Helm chart dependency and externalized identity verification. The design limits infrastructure costs via a shared ingress deployment while preserving state across container restarts.

Wrapping upstream Helm charts rather than forking them keeps maintenance overhead low while retaining full control over platform-level behavior.

Need Help?

If you're working on similar infrastructure challenges — self-hosted tooling, EKS platform design, or Helm chart architecture — feel free to reach out at hello@jakops.cloud.