Titouan Despierres

Shipping Java AI Services on Kubernetes in 2026: A Practical CI/CD Playbook (GitHub Actions + GitLab CI + Argo CD)

If you build production systems with Java and Kubernetes, 2026 feels different from even two years ago.

The stack is more capable, but also less forgiving:

  • Java moved forward fast (JDK 21 in production, JDK 25 now on the adoption table)
  • AI features are no longer “labs-only” and now sit in real SLAs
  • Kubernetes keeps evolving APIs and operational defaults
  • CI/CD shifted from “just run tests” to provenance, policy, and progressive delivery

This article is a field playbook: what changed, why it’s better (or what trade-offs you pay), and how to adopt safely in production.


1) Java in 2026: LTS strategy is now an architecture decision

Many teams are on Java 17 or 21. The strategic question now is not if you move, but how often you want to absorb JVM/runtime changes.

What changed

  • JDK 21 proved itself in production (virtual threads adoption matured, better startup/perf ergonomics).
  • JDK 25 (LTS) is now the next serious target for teams wanting fresh runtime optimizations and longer runway.
  • Teams increasingly separate:
    • language/runtime cadence (JDK upgrades)
    • framework cadence (Spring, Micronaut, Quarkus)

Why it’s better

  • Better throughput-per-core and latency consistency on modern workloads.
  • Easier concurrency scaling with virtual threads for I/O-heavy services.
  • Cleaner migration planning when platform teams enforce one baseline JDK per environment.
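
For I/O-heavy services, the virtual-threads point is easy to demonstrate concretely. A minimal, self-contained sketch (class and method names are mine, and the sleep stands in for a blocking downstream call):

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

public class VirtualThreadsDemo {

    // Runs `tasks` blocking calls, one virtual thread per task,
    // and returns how many completed.
    static int runBlockingTasks(int tasks) throws InterruptedException {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Callable<Integer>> work = IntStream.range(0, tasks)
                .<Callable<Integer>>mapToObj(i -> () -> {
                    Thread.sleep(Duration.ofMillis(20)); // simulated blocking I/O
                    return i;
                })
                .toList();
            // invokeAll blocks until every task has finished
            return (int) executor.invokeAll(work).stream()
                .filter(Future::isDone)
                .count();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBlockingTasks(1_000)); // prints 1000
    }
}
```

A fan-out like this would exhaust a fixed platform-thread pool; with virtual threads each blocked task is cheap, so the same code scales without a thread-pool sizing debate.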

Trade-offs

  • JDK upgrades surface hidden assumptions (reflection, unsafe libs, GC tuning folklore).
  • Preview/incubator features are tempting but can create policy debates in regulated environments.

Adoption strategy (production-safe)

  1. Move from “lift-and-pray” to benchmark-driven migration.
  2. Enforce runtime parity between CI and cluster images.
  3. Make rollback trivial: immutable image tags + GitOps revision pin.

Useful baseline command set:

# JVM + GC shape under realistic load
java -Xlog:gc*:stdout:time -jar app.jar

# JDK migration smoke check: surface JDK-internal API usage before upgrading
jdeps --multi-release 25 --jdk-internals app.jar

2) AI in production: model choice became a platform concern, not an app concern

In many companies, AI features now sit inside Java APIs (classification, extraction, assistant workflows, anomaly triage). The biggest shift is operational: model routing, fallback, and cost controls are now platform-level primitives.

What changed

  • Teams run multi-model strategies (fast/small + accurate/expensive fallback).
  • Inference is increasingly exposed behind internal gateways with quotas and observability.
  • “Prompt quality” alone is no longer enough; SLOs and budget guardrails decide architecture.

Why it’s better

  • Better cost/performance by matching model size to request class.
  • Easier governance (PII handling, rate limits, auditability) when centralized.
  • Faster iteration when app teams consume a stable internal AI contract.

Trade-offs

  • More infra complexity (gateways, retries, circuit breakers, prompt/version tracing).
  • Evaluation debt: you need regression datasets, not just anecdotal testing.

Practical Java pattern

Keep application code model-agnostic:

public interface AiClient {
    AiResult infer(AiRequest request);
}

Then route by policy (latency tier, tenant, budget) outside business logic.
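
One way to sketch that routing (the record fields and tier names here are illustrative stand-ins, not a real API; the types are restated so the snippet compiles on its own):

```java
import java.util.Map;

// Minimal stand-ins for the request/result types; fields are illustrative.
record AiRequest(String tenant, String prompt, boolean latencySensitive) {}
record AiResult(String text) {}

interface AiClient {
    AiResult infer(AiRequest request);
}

// Policy-based router: business code only ever sees AiClient, while tier
// selection (here a simple latency flag) lives in one place.
final class RoutingAiClient implements AiClient {
    private final Map<String, AiClient> tiers; // e.g. "fast", "accurate"

    RoutingAiClient(Map<String, AiClient> tiers) {
        this.tiers = tiers;
    }

    @Override
    public AiResult infer(AiRequest request) {
        String tier = request.latencySensitive() ? "fast" : "accurate";
        return tiers.get(tier).infer(request);
    }
}
```

In production the policy input would be richer (tenant budgets, token counts, current spend), but the shape stays the same: one routing point, zero model names in business code.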

And for reliability, treat AI calls like any other remote dependency:

  • timeout budgets
  • bulkheads
  • fallback behavior
  • semantic caching where safe
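
Those four bullets can be combined into a small wrapper. This is a sketch, not a production circuit breaker (types are restated as minimal records so it compiles standalone; class names are mine): a hard timeout budget on the primary model, with fallback to a secondary on timeout or failure.

```java
import java.time.Duration;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

record AiRequest(String prompt) {}
record AiResult(String text) {}
interface AiClient { AiResult infer(AiRequest request); }

// Treats the AI call like any remote dependency: bounded wait, explicit fallback.
final class FallbackAiClient implements AiClient {
    private final AiClient primary;
    private final AiClient fallback;
    private final Duration timeout;
    private final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();

    FallbackAiClient(AiClient primary, AiClient fallback, Duration timeout) {
        this.primary = primary;
        this.fallback = fallback;
        this.timeout = timeout;
    }

    @Override
    public AiResult infer(AiRequest request) {
        Future<AiResult> call = executor.submit(() -> primary.infer(request));
        try {
            return call.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            call.cancel(true);              // stop waiting on the slow/broken primary
            return fallback.infer(request); // degrade to the backup model
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for inference", e);
        }
    }
}
```

Bulkheads (per-dependency concurrency limits) and semantic caching would layer on the same wrapper; the point is that none of this lives in business logic.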

3) Kubernetes: API drift and operations policy are now daily reality

Kubernetes evolution is steady, and the operational impact is cumulative. Teams that delay upgrades too long pay the “API cliff” tax later.

What changed

  • Deprecated APIs keep disappearing across releases.
  • Security defaults and admission controls are tighter in modern clusters.
  • Progressive delivery expectations grew: canary/blue-green is becoming standard for critical services.

Why it’s better

  • Better security posture by default.
  • Fewer ambiguous deployment states.
  • Better resilience when rollout policy is encoded instead of tribal.

Trade-offs

  • Manifest maintenance is continuous work, not one-off.
  • Platform teams must own upgrade rehearsal and compatibility scanning.

Adoption strategy

  • Run a scheduled API deprecation scan in CI.
  • Keep app manifests simple; push complexity to platform-level templates.
  • Normalize probes/resources/securityContext defaults.

Example deployment baseline:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  template:
    metadata:
      labels:
        app: orders-api
    spec:
      securityContext:
        runAsNonRoot: true
      containers:
        - name: app
          image: ghcr.io/acme/orders-api:1.12.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet: { path: /actuator/health/readiness, port: 8080 }
          livenessProbe:
            httpGet: { path: /actuator/health/liveness, port: 8080 }
          resources:
            requests: { cpu: "250m", memory: "512Mi" }
            limits: { cpu: "1000m", memory: "1Gi" }

4) CI/CD in 2026: from pipeline automation to delivery governance

The old target was “green pipeline.” The new target is “trusted, reversible delivery.”

That means three things:

  1. Build provenance/security signals
  2. Progressive deploy policy
  3. Fast rollback path

GitHub Actions example (build + scan + push)

name: build-and-push
on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '25'
      - name: Build
        run: ./gradlew clean test bootJar
      - name: Log in to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      - name: Build image
        run: docker build -t ghcr.io/acme/orders-api:${{ github.sha }} .
      - name: Scan image
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ghcr.io/acme/orders-api:${{ github.sha }}
          exit-code: "1"
          severity: CRITICAL,HIGH
      - name: Push image
        run: docker push ghcr.io/acme/orders-api:${{ github.sha }}

GitLab CI example (test + image + manifest update)

stages: [test, build, deploy]

variables:
  IMAGE: registry.gitlab.com/acme/orders-api:$CI_COMMIT_SHA

test:
  stage: test
  image: eclipse-temurin:25
  script:
    - ./gradlew test

build_image:
  stage: build
  image: docker:27
  services: [docker:27-dind]
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$IMAGE" .
    - docker push "$IMAGE"

deploy_gitops:
  stage: deploy
  image: alpine:3.20
  script:
    - apk add --no-cache git yq
    - git clone https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.com/acme/platform-config.git
    - cd platform-config/apps/orders-api/overlays/prod
    - yq -i '.images[0].newTag = strenv(CI_COMMIT_SHA)' kustomization.yaml
    - git config user.email "ci@acme.com"
    - git config user.name "gitlab-ci"
    - git commit -am "orders-api: $CI_COMMIT_SHA"
    - git push  # requires a token with write access to the config repo; job tokens cannot push by default

This is the key shift: application repo builds artifacts; config repo controls runtime state.


5) GitOps with Argo CD: make rollback boring

Argo CD works best when teams optimize for predictability, not cleverness.

Core production pattern

  • One app, one clear source of truth path
  • Environment overlays (dev/stage/prod) via Helm or Kustomize
  • Sync policies explicit; auto-sync only where blast radius is understood

Example Argo CD Application:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orders-api-prod
spec:
  project: prod
  source:
    repoURL: https://github.com/acme/platform-config.git
    targetRevision: main
    path: apps/orders-api/overlays/prod
  destination:
    server: https://kubernetes.default.svc
    namespace: orders
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

Rollback strategy that actually works

  • Retain the previous N image tags in the registry.
  • Roll back by reverting the config commit (never kubectl patch in prod).
  • Measure rollback MTTR as a platform KPI.

If rollback is manual and stressful, your delivery system is unfinished.


6) A realistic 90-day adoption roadmap

If your current baseline is Java 17/21 + Kubernetes + ad-hoc CI, here is a pragmatic rollout:

Days 1–30: Stabilize the foundation

  • Inventory JDK/runtime versions and unsupported dependencies.
  • Standardize base container image and JVM flags.
  • Add deployment health checks and resource defaults.
  • Introduce config repo for at least one service.

Days 31–60: Secure and govern delivery

  • Add image scanning + dependency checks in CI.
  • Enforce branch protections and required pipeline gates.
  • Add Argo CD for one production service with clear rollback drill.
  • Define AI request classes and model routing policy.

Days 61–90: Optimize for speed and reliability

  • Migrate a high-traffic Java service to latest target JDK.
  • Add canary rollout policy for critical services.
  • Instrument end-to-end latency (API + model + DB) with SLO dashboards.
  • Track DORA + rollback MTTR + AI cost/request in one operational view.

Final take

The winning pattern in 2026 is not “adopt every new thing fast.”

It is:

  • Upgrade deliberately (Java + Kubernetes)
  • Operationalize AI (routing, reliability, cost controls)
  • Treat CI/CD as governance (not just automation)
  • Use GitOps to make rollback routine

If your team can answer, at any moment, what changed, why it changed, and how to revert safely, you are ahead of most organizations already.

That’s what mature platform engineering looks like today.
