
Carnell Smith

GPU_WORKLOAD_MISMATCH: A Novel Security Finding Category for AI Container Workloads

Defensive Publication: GPU_WORKLOAD_MISMATCH

A Novel Security Finding Category for AI Container Workloads

Author: Carnell Smith, Champtron Systems LLC

Date: June 9, 2026

Affiliation: NVIDIA Inception Member

Defensive publication notice: This document is published to establish prior art for the methods described herein and to help prevent third parties from obtaining patent protection over these techniques.


Abstract

This disclosure describes a method for detecting a previously unnamed class of security misconfiguration in containerized AI and GPU workload environments.

The method identifies the condition where a host system has GPU workload intent configured at the container runtime level, but no physical NVIDIA GPU, driver stack, or CUDA runtime is present on the host.

This condition is designated:

GPU_WORKLOAD_MISMATCH

The condition creates operational security risk, compliance gaps, and unverifiable execution claims that existing container security tools may not detect.

This publication describes:

  • The detection method
  • The cross-check logic
  • The severity classification
  • The broader finding taxonomy within which this category exists
  • Related AI model, post-quantum cryptography, and remediation scoring methods

1. Background and Problem Statement

The proliferation of GPU-accelerated AI workloads in enterprise and government environments has created a new class of container security misconfiguration that existing security tools were not designed to detect.

Major commercial container security platforms commonly perform:

  • CVE scanning
  • Dockerfile analysis
  • Kubernetes manifest auditing
  • Runtime behavior monitoring
  • Secrets detection
  • Image vulnerability assessment

However, many of these tools operate without full awareness of the GPU and CUDA software stack.

A specific vulnerability class arises when a Docker host or container environment declares GPU workload intent, while the underlying host is physically and functionally incapable of GPU execution.


1.1 GPU Workload Intent Conditions

A host or container may indicate GPU workload intent through one or more of the following conditions.

Host-level GPU intent

A Docker host has the NVIDIA Container Runtime registered in its daemon configuration, either through:

/etc/docker/daemon.json

or as reported by:

docker info

This indicates intent to support GPU workloads.

Container-level GPU intent

One or more running containers declare GPU workload intent through any of the following indicators:

  • CUDA_VISIBLE_DEVICES environment variable
  • NVIDIA_VISIBLE_DEVICES environment variable
  • NVIDIA runtime assignment: HostConfig.Runtime = "nvidia"
  • Direct NVIDIA device mounts matching /dev/nvidia* inside HostConfig.Devices

1.2 Missing GPU Capability Conditions

The risk condition exists when GPU workload intent is present and all of the following are also true:

  1. No physical NVIDIA GPU is detectable on the host through lspci or nvidia-smi.
  2. No NVIDIA driver is installed or functional.
  3. No CUDA runtime is present, including the absence of:
    • nvcc
    • libcudart.so
    • A valid CUDA_PATH environment variable

1.3 Security Risks

When this condition exists, the host presents a GPU-capable configuration surface to container workloads and orchestration systems while being incapable of GPU execution.

This creates several security and operational risks.

Unverifiable execution claims

Workloads that claim GPU-accelerated execution cannot be verified. Audit logs, compliance reports, and attestation records may contain false or unsupported claims about the execution environment.

Scheduling and routing trust violations

In federated or multi-node environments, a misconfigured host may accept GPU workloads it cannot execute. This can produce silent failures or unexpected CPU fallback behavior that is not surfaced to security or compliance teams.

Compliance gaps

Regulated environments such as DoD, healthcare AI, and financial services may require attestable GPU execution for AI model inference. When this condition is undetected, the organization cannot validate its compliance posture.

Configuration drift indicators

The condition may indicate:

  • Unauthorized modification of Docker daemon configuration
  • Partial uninstallation of GPU drivers
  • Hardware removal without configuration cleanup
  • Misaligned orchestration policy
  • Drift between declared runtime capability and actual host capability

Each of these is a security-relevant event.


2. Detection Method

The detection method executes independent checks against the host system and running containers, then evaluates the combined results through conservative cross-check logic.


2.1 Host-Level Checks

The minimum host-level checks are described below.

check_gpu_present()

Executes hardware enumeration checks such as:

lspci

and/or:

nvidia-smi -L

The check identifies whether physical NVIDIA GPU devices are present.

Expected return values include:

  • Boolean pass/fail result
  • Detected device name, where available
  • Detected device count, where available
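As a rough illustration of the hardware enumeration step, the sketch below parses lspci output for NVIDIA display-class devices. It is not the author's implementation: the helper name parse_lspci_for_nvidia, the returned dictionary keys, and the choice to treat a missing lspci binary as a failed check are all assumptions, and the nvidia-smi -L path is left out for brevity.

```python
import re
import shutil
import subprocess

# lspci reports GPUs under one of these device classes. NVIDIA's PCI vendor
# ID is 10de, which appears as "[10de:xxxx]" when lspci is run with -nn.
_DISPLAY_CLASS = re.compile(
    r"VGA compatible controller|3D controller|Display controller", re.I
)


def parse_lspci_for_nvidia(lspci_output: str) -> list[str]:
    """Return the lspci lines that describe NVIDIA display-class devices."""
    devices = []
    for line in lspci_output.splitlines():
        lowered = line.lower()
        if _DISPLAY_CLASS.search(line) and ("nvidia" in lowered or "[10de:" in lowered):
            devices.append(line.strip())
    return devices


def check_gpu_present() -> dict:
    """Run `lspci -nn` when available and summarise NVIDIA GPU presence."""
    if shutil.which("lspci") is None:
        # Assumption: without lspci the sketch reports a failed check.
        return {"passed": False, "devices": [], "count": 0}
    result = subprocess.run(
        ["lspci", "-nn"], capture_output=True, text=True, check=False
    )
    devices = parse_lspci_for_nvidia(result.stdout)
    return {"passed": bool(devices), "devices": devices, "count": len(devices)}
```

Keeping the parsing separate from the subprocess call makes the logic testable on hosts that have neither lspci nor a GPU.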

check_nvidia_driver()

Executes:

nvidia-smi --query-gpu=name,driver_version --format=csv,noheader

This verifies whether the NVIDIA driver is installed and functional.

Expected return values include:

  • Pass/fail result
  • Driver version string, where available
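A minimal sketch of the driver check follows, assuming nvidia-smi is queried in CSV form with its --format option and that a non-zero exit code or unparseable output counts as a failure. The helper parse_driver_query and the result keys are hypothetical names chosen for this example.

```python
import shutil
import subprocess


def parse_driver_query(csv_output: str) -> list[tuple[str, str]]:
    """Parse `name, driver_version` rows from headerless nvidia-smi CSV output."""
    rows = []
    for line in csv_output.splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Error messages from nvidia-smi do not have this two-field shape.
        if len(parts) == 2 and all(parts):
            rows.append((parts[0], parts[1]))
    return rows


def check_nvidia_driver() -> dict:
    """Report whether the driver responds and, if so, its version string."""
    if shutil.which("nvidia-smi") is None:
        return {"passed": False, "driver_version": None}
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version",
             "--format=csv,noheader"],
            capture_output=True, text=True, timeout=15, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return {"passed": False, "driver_version": None}
    rows = parse_driver_query(result.stdout) if result.returncode == 0 else []
    return {"passed": bool(rows), "driver_version": rows[0][1] if rows else None}
```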

check_cuda_runtime()

Checks for CUDA runtime availability by validating:

  • nvcc binary availability
  • Presence of libcudart.so in standard library paths
  • Validity of the CUDA_PATH environment variable

Expected return value:

  • Pass/fail result
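The three CUDA runtime signals can be gathered with the Python standard library alone. The sketch below assumes the check passes when any one signal is present, which matches the cross-check's requirement that all signals be absent before a mismatch is raised. The injectable which and find_library parameters are an assumption added to make the function testable.

```python
import ctypes.util
import os
import shutil


def check_cuda_runtime(environ=None, which=shutil.which,
                       find_library=ctypes.util.find_library) -> dict:
    """Collect evidence of a CUDA runtime; pass if any signal is present."""
    environ = os.environ if environ is None else environ
    cuda_path = environ.get("CUDA_PATH", "")
    evidence = {
        # nvcc on PATH suggests the CUDA toolkit is installed.
        "nvcc": which("nvcc") is not None,
        # find_library searches the platform's standard library locations
        # for the CUDA runtime shared library (libcudart.so on Linux).
        "libcudart": find_library("cudart") is not None,
        # CUDA_PATH counts only when it points at an existing directory.
        "cuda_path": bool(cuda_path) and os.path.isdir(cuda_path),
    }
    return {"passed": any(evidence.values()), "evidence": evidence}
```

Returning the per-signal evidence alongside the verdict lets a report explain why the check passed or failed.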

2.2 Container-Level Checks

The following checks detect GPU workload intent at the Docker runtime and container level.

check_docker_gpu_runtime()

Queries Docker runtime configuration using:

docker info --format '{{json .Runtimes}}'

and inspects:

/etc/docker/daemon.json

The check looks for the presence of the nvidia runtime key.

Expected return value:

  • Pass/fail result
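One way to evaluate both sources is to parse them as JSON and look for an nvidia key. In this sketch the function names are illustrative, the inputs are the raw text of the docker info output and of daemon.json, and treating a default-runtime value of nvidia as intent is an added assumption.

```python
import json


def runtimes_from_docker_info(info_json: str) -> set[str]:
    """Runtime names from `docker info --format '{{json .Runtimes}}'` output."""
    data = json.loads(info_json or "{}")
    return set(data) if isinstance(data, dict) else set()


def runtimes_from_daemon_json(daemon_json: str) -> set[str]:
    """Runtime names declared in the text of /etc/docker/daemon.json."""
    data = json.loads(daemon_json or "{}")
    if not isinstance(data, dict):
        return set()
    names = set((data.get("runtimes") or {}).keys())
    # Assumption: a default-runtime entry also signals GPU intent.
    if data.get("default-runtime"):
        names.add(data["default-runtime"])
    return names


def check_docker_gpu_runtime(info_json: str = "", daemon_json: str = "") -> bool:
    """True when the nvidia runtime is registered in either source."""
    registered = runtimes_from_docker_info(info_json) | runtimes_from_daemon_json(daemon_json)
    return "nvidia" in registered
```

Malformed JSON raises json.JSONDecodeError here; a production check would likely catch that and report it as its own finding.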

check_gpu_enabled_containers()

Iterates through running containers using:

docker ps -q

Then inspects each container using:

docker inspect <container_id>

The check detects the following GPU workload indicators:

  • HostConfig.Runtime == "nvidia"
  • HostConfig.Devices containing paths matching /dev/nvidia*
  • Config.Env containing:
    • CUDA_VISIBLE_DEVICES
    • NVIDIA_VISIBLE_DEVICES
    • Other CUDA-related environment variables

Expected return value:

  • A list of containers with GPU workload indicators
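The indicator scan can be expressed as a pure function over the JSON that docker inspect returns. The sketch below assumes the inspect output has already been parsed into a list of dictionaries. It covers only the two environment variables named above, since the other CUDA-related variables are not enumerated, and the result shape is hypothetical.

```python
import fnmatch

GPU_ENV_NAMES = {"CUDA_VISIBLE_DEVICES", "NVIDIA_VISIBLE_DEVICES"}


def gpu_indicators(container: dict) -> list[str]:
    """Return the GPU workload indicators present in one inspect record."""
    host_cfg = container.get("HostConfig") or {}
    config = container.get("Config") or {}
    found = []
    if host_cfg.get("Runtime") == "nvidia":
        found.append("runtime=nvidia")
    # Devices is null in inspect output when no devices are mapped.
    for dev in host_cfg.get("Devices") or []:
        path = dev.get("PathOnHost", "")
        if fnmatch.fnmatch(path, "/dev/nvidia*"):
            found.append(f"device={path}")
    # Env entries have the form NAME=value.
    for entry in config.get("Env") or []:
        name = entry.split("=", 1)[0]
        if name in GPU_ENV_NAMES:
            found.append(f"env={name}")
    return found


def check_gpu_enabled_containers(inspected: list[dict]) -> list[dict]:
    """List containers that carry at least one GPU workload indicator."""
    results = []
    for c in inspected:
        indicators = gpu_indicators(c)
        if indicators:
            results.append({
                "id": c.get("Id", "")[:12],
                "name": c.get("Name", ""),
                "indicators": indicators,
            })
    return results
```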

2.3 Cross-Check Logic

The GPU_WORKLOAD_MISMATCH finding is derived through a cross-check function that evaluates the combined results of the individual checks.

IF (check_gpu_present() == FAIL)
   AND (check_nvidia_driver() == FAIL)
   AND (check_cuda_runtime() == FAIL)
   AND (
     check_docker_gpu_runtime() == PASS
     OR (check_gpu_enabled_containers() returns a non-empty container list)
   )
THEN raise GPU_WORKLOAD_MISMATCH finding

This logic is intentionally conservative.

All three hardware, driver, and runtime checks must fail, confirming true absence of GPU capability.

At least one workload-intent indicator must also be present, confirming true intent to use GPU capability.

This conjunction prevents false positives on systems that are simply non-GPU hosts with no GPU configuration.
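The cross-check reduces to a small boolean function. The sketch below restates the logic above in Python; the CheckResults container and its field names are assumptions made for this example, with True standing for PASS.

```python
from dataclasses import dataclass, field


@dataclass
class CheckResults:
    """Outcomes of the five individual checks (True means PASS)."""
    gpu_present: bool
    nvidia_driver: bool
    cuda_runtime: bool
    docker_gpu_runtime: bool
    gpu_containers: list = field(default_factory=list)


def has_gpu_workload_mismatch(r: CheckResults) -> bool:
    """True when GPU intent is declared but no GPU capability exists."""
    # All three capability checks must fail.
    capability_absent = not (r.gpu_present or r.nvidia_driver or r.cuda_runtime)
    # At least one intent indicator must be present.
    intent_present = r.docker_gpu_runtime or bool(r.gpu_containers)
    return capability_absent and intent_present
```

A plain CPU host (no capability, no intent) and a healthy GPU host (capability and intent) both return False; only declared intent without any capability returns True.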


2.4 Finding Structure

When the cross-check condition is satisfied, a structured finding is produced.

| Attribute | Value |
| --- | --- |
| Category | GPU_WORKLOAD_MISMATCH |
| Finding category number | 13 |
| Severity | HIGH |
| Title | GPU workload declared but no physical NVIDIA GPU detected |
| Description | Docker or container settings indicate GPU workload intent, but no NVIDIA GPU, driver, or CUDA runtime was detected on the host. |
| Recommendation | Verify host hardware, NVIDIA driver installation, NVIDIA Container Toolkit configuration, and whether the container should be scheduled on a GPU-capable node. |
| Check IDs | References the five individual checks that contributed to the finding |

2.5 Status Label Differentiation

A secondary method concerns the differentiation of container check status labels based on the cross-check result.

When the GPU_WORKLOAD_MISMATCH condition is present, check statuses are adjusted to avoid misleading pass/fail output.

Docker GPU runtime status

The Docker GPU runtime check would normally display as:

[PASS]

because the runtime is registered.

However, when no physical GPU capability exists, it is relabeled as:

[WARN]

This indicates that the configuration is present but cannot be validated against physical hardware.


Container GPU workload indicator status

The container GPU workload indicator check would normally display as:

[PASS]

because GPU workload indicators were found.

However, when no physical GPU is present, it is relabeled as:

[WARN]

with detail text similar to:

GPU workload indicators found in N container(s), but no physical NVIDIA GPU is available on this host.

This differentiation gives operators a more accurate representation of the security state.

A simple pass/fail binary does not capture the risk of a partial GPU configuration.
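The relabeling rule can be sketched as a small display function. The check identifiers and the detail text for the runtime check are assumptions; the container detail text follows the wording quoted above.

```python
# Checks whose PASS only means "intent was found", not "GPU is usable".
INTENT_CHECKS = {"check_docker_gpu_runtime", "check_gpu_enabled_containers"}


def display_status(check_id: str, raw_status: str, mismatch: bool,
                   container_count: int = 0) -> tuple[str, str]:
    """Return (label, detail), downgrading intent PASSes during a mismatch."""
    if mismatch and raw_status == "PASS" and check_id in INTENT_CHECKS:
        if check_id == "check_gpu_enabled_containers":
            detail = (
                f"GPU workload indicators found in {container_count} "
                "container(s), but no physical NVIDIA GPU is available "
                "on this host."
            )
        else:
            detail = (
                "NVIDIA runtime is registered, but it cannot be validated "
                "against physical hardware."
            )
        return "WARN", detail
    # Every other status is shown unchanged.
    return raw_status, ""
```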


3. Finding Category Taxonomy

This disclosure also describes a 13-category finding taxonomy for GPU, AI, and post-quantum cryptography security findings in containerized environments.

| # | Category | Description |
| --- | --- | --- |
| 1 | GPU_SECURITY | General GPU hardware and configuration security |
| 2 | CUDA_HARDENING | CUDA container runtime hardening |
| 3 | DRIVER_COMPLIANCE | NVIDIA driver compliance and currency |
| 4 | CONTAINER_RUNTIME | Container runtime security configuration |
| 5 | POLICY_VIOLATION | Security policy violations |
| 6 | SECRETS_EXPOSURE | Secrets and credentials exposure |
| 7 | LICENSE_RISK | Software license compliance risk |
| 8 | STIG_FINDING | DISA STIG control findings |
| 9 | CIS_FINDING | CIS Benchmark findings |
| 10 | NIST_FINDING | NIST SP 800-190 and related findings |
| 11 | AI_GOVERNANCE | AI model security and governance findings |
| 12 | SUPPLY_CHAIN | Software supply chain security |
| 13 | GPU_WORKLOAD_MISMATCH | GPU workload intent declared without GPU capability |

Each finding produced by any audit module is assigned exactly one category from this taxonomy.

This enables:

  • Cross-module correlation
  • Aggregation by category in dashboards
  • Structured reporting for compliance frameworks
  • Better prioritization of GPU, AI, PQC, and container security issues

4. AI Model Security Scanning Method

This disclosure additionally describes a method for scanning running container filesystems for embedded AI model files and evaluating their security posture.


4.1 Model Format Detection

The method scans container filesystems using commands such as:

docker exec <container_id> find / -type f

It searches for files with extensions associated with AI model formats.

| Extension | Model Format |
| --- | --- |
| .onnx | ONNX |
| .pt | PyTorch |
| .pth | PyTorch |
| .safetensors | SafeTensors |
| .gguf | GGUF / llama.cpp |
| .pkl | Pickle |
| .pickle | Pickle |
| .pb | TensorFlow |
| .h5 | Keras |
| .keras | Keras |

4.2 Unsafe Format Detection

Pickle-format model files are identified through the following extensions:

  • .pkl
  • .pickle

These formats are security-sensitive because deserialization may allow arbitrary code execution.

When detected, this finding is assigned:

| Attribute | Value |
| --- | --- |
| Category | AI_GOVERNANCE |
| Severity | HIGH |
| Risk | Unsafe AI model deserialization path |
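To illustrate how the extension mapping and the Pickle rule might combine, the sketch below classifies a list of file paths, such as the output of the find command in Section 4.1. The function name and result shape are hypothetical, and matching extensions case-insensitively is an assumption.

```python
import posixpath

# Extension-to-format mapping taken from Section 4.1.
MODEL_FORMATS = {
    ".onnx": "ONNX", ".pt": "PyTorch", ".pth": "PyTorch",
    ".safetensors": "SafeTensors", ".gguf": "GGUF / llama.cpp",
    ".pkl": "Pickle", ".pickle": "Pickle",
    ".pb": "TensorFlow", ".h5": "Keras", ".keras": "Keras",
}
UNSAFE_EXTENSIONS = {".pkl", ".pickle"}


def classify_model_files(paths: list[str]) -> list[dict]:
    """Keep paths with a known model extension and flag Pickle files."""
    findings = []
    for path in paths:
        # Container paths are POSIX paths regardless of the scanning host.
        ext = posixpath.splitext(path)[1].lower()
        fmt = MODEL_FORMATS.get(ext)
        if fmt is None:
            continue
        entry = {"path": path, "format": fmt, "unsafe": ext in UNSAFE_EXTENSIONS}
        if entry["unsafe"]:
            entry.update(category="AI_GOVERNANCE", severity="HIGH")
        findings.append(entry)
    return findings
```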

4.3 Integrity Verification

The method checks for SHA256 hash sidecar files alongside model files.

Expected sidecar pattern:

<model_file>.sha256

Models above a minimum size threshold, such as 50 MB, are flagged when no corresponding hash record is present.

This indicates missing model integrity verification.
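A sidecar check of this kind can run over a simple path-to-size listing. The sketch below assumes the listing has already been collected from the container, interprets 50 MB as 50 × 1024 × 1024 bytes, and reuses the Section 4.1 extensions; all names are illustrative.

```python
MODEL_EXTENSIONS = (
    ".onnx", ".pt", ".pth", ".safetensors", ".gguf",
    ".pkl", ".pickle", ".pb", ".h5", ".keras",
)
# Assumption: the 50 MB threshold is read as binary megabytes.
MIN_SIZE_BYTES = 50 * 1024 * 1024


def missing_hash_sidecars(files: dict[str, int],
                          min_size: int = MIN_SIZE_BYTES) -> list[str]:
    """Return large model files that lack a `<model_file>.sha256` sidecar.

    `files` maps every path in the scanned tree to its size in bytes.
    """
    flagged = []
    for path, size in files.items():
        if not path.lower().endswith(MODEL_EXTENSIONS):
            continue
        if size > min_size and f"{path}.sha256" not in files:
            flagged.append(path)
    return sorted(flagged)
```

This only confirms that a hash record exists. Verifying that the recorded digest matches the model file would be a separate step.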


4.4 CUDA Compute Mismatch

When a container image name or environment variables indicate CUDA or NVIDIA requirements, but host-level GPU checks confirm that no physical GPU is present, a CUDA_HARDENING finding is raised.

This finding indicates that the container's compute requirements cannot be met by the host.


4.5 LLM Endpoint Exposure

The method detects containers serving large language model inference by matching image name patterns associated with known LLM serving frameworks.

Examples include:

  • ollama
  • vllm
  • triton
  • tgi

The method checks for containers that expose inference ports on 0.0.0.0 and lack authentication-related configuration in environment variables.

This identifies LLM inference endpoints that may be exposed without adequate access control.
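A heuristic along these lines can be written against a parsed docker inspect record. In the sketch below the list of authentication hints is a hypothetical placeholder, since the exact variable names checked are not specified here, and substring matching on image names is an assumption.

```python
from typing import Optional

LLM_IMAGE_PATTERNS = ("ollama", "vllm", "triton", "tgi")
# Hypothetical hints: the real set of auth-related names is not given.
AUTH_ENV_HINTS = ("API_KEY", "AUTH", "TOKEN", "PASSWORD")


def exposed_llm_endpoint(container: dict) -> Optional[dict]:
    """Flag an LLM-serving container published on 0.0.0.0 without auth hints."""
    config = container.get("Config") or {}
    image = config.get("Image", "").lower()
    if not any(p in image for p in LLM_IMAGE_PATTERNS):
        return None
    # Ports maps "port/proto" to a list of host bindings, or null.
    ports = (container.get("NetworkSettings") or {}).get("Ports") or {}
    public = sorted(
        port for port, bindings in ports.items()
        if any(b.get("HostIp") == "0.0.0.0" for b in bindings or [])
    )
    env_names = [e.split("=", 1)[0].upper() for e in config.get("Env") or []]
    has_auth = any(h in name for name in env_names for h in AUTH_ENV_HINTS)
    if public and not has_auth:
        return {"image": image, "public_ports": public}
    return None
```

Because the test is name-based, it can miss authentication enforced by a reverse proxy and can be satisfied by an unrelated token variable, so a result is best read as a prompt for review.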


5. Post-Quantum Cryptography Container Scanning Method

This disclosure describes a method for detecting quantum-vulnerable cryptographic algorithm configurations in running container environments and mapping findings to NSA CNSA 2.0 compliance controls.


5.1 Detection Method

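The method scans container environment variables and image labels for string patterns associated with quantum-vulnerable algorithms, together with the legacy hashes, cipher strengths, and protocol versions in the pattern list below.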

Patterns include references to:

  • RSA key specifications
  • ECDSA references
  • ECDH references
  • Diffie-Hellman parameters
  • SHA-1
  • MD5
  • AES-128 cipher specifications
  • TLS 1.2 configuration strings
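String-pattern scanning of this kind is commonly done with regular expressions. The patterns in the sketch below are illustrative guesses at how each reference might appear in configuration strings, not the expressions the method actually uses. Explicit alphanumeric lookarounds replace `\b` so that underscore-delimited cipher suite names still match.

```python
import re


def _token(body: str) -> re.Pattern:
    """Match `body` only when it is not embedded in a longer alphanumeric run."""
    return re.compile(rf"(?<![A-Za-z0-9])(?:{body})(?![A-Za-z0-9])", re.I)


# Hypothetical patterns, one per entry in the list above.
LEGACY_PATTERNS = {
    "RSA": _token(r"RSA[-_ ]?\d{3,5}"),
    "ECDSA": _token(r"ECDSA"),
    "ECDH": _token(r"ECDHE?"),
    "Diffie-Hellman": _token(r"DHE?|Diffie[-_ ]?Hellman"),
    "SHA-1": _token(r"SHA[-_]?1"),
    "MD5": _token(r"MD5"),
    "AES-128": _token(r"AES[-_]?128"),
    "TLS 1.2": _token(r"TLSv?[-_ ]?1[._]2"),
}


def scan_for_legacy_crypto(env: list[str],
                           labels: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source, algorithm) pairs for every pattern hit."""
    texts = [(f"env:{e.split('=', 1)[0]}", e) for e in env]
    texts += [(f"label:{k}", f"{k}={v}") for k, v in labels.items()]
    hits = []
    for source, text in texts:
        for name, pattern in LEGACY_PATTERNS.items():
            if pattern.search(text):
                hits.append((source, name))
    return hits
```

Matches of this kind show that an algorithm is mentioned in configuration, not that it is in use, so each hit still needs confirmation against the running service.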

5.2 Compliance Mapping

Each detected pattern is mapped to a CNSA 2.0 control identifier.

| Control Area | Control ID |
| --- | --- |
| Key encapsulation | KE-1 |
| Symmetric cipher requirements | SC-1 |
| Hash algorithm requirements | HA-1 |
| Transport protocol requirements | TP-1 |

5.3 PQC Algorithm Detection

The method scans for references to CNSA 2.0-aligned or post-quantum cryptography algorithms, including:

  • ML-KEM
  • FIPS 203
  • ML-DSA
  • FIPS 204
  • SLH-DSA
  • FIPS 205
  • AES-256
  • SHA-384
  • SHA-512

5.4 Migration Label Checking

The method verifies the presence of Docker image labels documenting a PQC migration target date.

Example labels include:

pqc.migration_target

and:

cnsa2.migration_date

6. Autonomous Remediation Confidence Scoring Method

This disclosure describes a method for scoring container security findings on a 0–100 confidence scale to determine the appropriate remediation disposition.


6.1 Confidence Levels

| Confidence Level | Score Range | Remediation Disposition |
| --- | --- | --- |
| High | 85–100 | Auto-apply deterministic fix patterns |
| Medium | 50–84 | Queue for one-click operator approval |
| Low | 0–49 | Require full manual review |

6.2 High-Confidence Remediation

High-confidence findings are those with known-safe, deterministic fix patterns.

Example:

--security-opt no-new-privileges:true

These remediations may be auto-applied with a cryptographically signed evidence record.

The evidence record includes an HMAC-SHA256 signature over the before-and-after state.
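An evidence record of this shape can be produced with Python's standard hmac module. The sketch below assumes the before-and-after state is JSON-serializable and that a canonical encoding with sorted keys is acceptable. The field names and key handling are hypothetical.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_evidence(key: bytes, finding_id: str, before: dict, after: dict) -> dict:
    """Build an evidence record with an HMAC-SHA256 over the before/after state."""
    record = {"finding_id": finding_id, "before": before, "after": after}
    record["hmac_sha256"] = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return record


def verify_evidence(key: bytes, record: dict) -> bool:
    """Recompute the HMAC over every field except the stored tag."""
    body = {k: v for k, v in record.items() if k != "hmac_sha256"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, record.get("hmac_sha256", ""))
```

Because HMAC uses a shared secret, anyone able to verify a record can also forge one; that trade-off is worth noting when the records serve as audit evidence.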


6.3 Medium-Confidence Remediation

Medium-confidence findings have category-appropriate fix patterns but require operator context verification.

These are queued for one-click approval rather than automatically applied.


6.4 Low-Confidence Remediation

Low-confidence findings require manual review because they may involve structural or high-impact changes, such as:

  • Removing privileged mode
  • Changing root user behavior
  • Modifying volume mounts
  • Adjusting runtime permissions
  • Changing network exposure

6.5 Always-Manual Blocklist

A blocklist of finding titles is maintained under:

_ALWAYS_MANUAL

This ensures that specific high-risk finding types never receive automatic remediation, regardless of confidence score.
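Taken together, the score bands in Section 6.1 and the blocklist suggest a small decision function. In the sketch below the blocklist entry and the disposition labels are hypothetical; only the thresholds and the _ALWAYS_MANUAL name come from the text.

```python
# Hypothetical entry; the actual blocklist contents are not given here.
_ALWAYS_MANUAL = frozenset({"Container running in privileged mode"})


def remediation_disposition(title: str, score: int) -> str:
    """Map a finding title and 0-100 confidence score to a disposition."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # The blocklist is consulted first so it overrides any score.
    if title in _ALWAYS_MANUAL:
        return "manual_review"
    if score >= 85:
        return "auto_apply"
    if score >= 50:
        return "operator_approval"
    return "manual_review"
```

Checking the blocklist before the score bands guarantees that a blocklisted finding never reaches the auto-apply path, even with a score of 100.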


7. Prior Art Statement

To the best of the author's knowledge, as of the date of this publication, no prior art exists for the following:

  1. The specific GPU_WORKLOAD_MISMATCH cross-check detection method described in Section 2.
  2. The 13-category finding taxonomy for GPU, AI, and PQC container security described in Section 3.
  3. The AI model Pickle format and integrity detection method in containerized environments described in Section 4.
  4. The CNSA 2.0 container scanning and mapping method described in Section 5.
  5. The confidence-scored autonomous remediation method with signed evidence described in Section 6.

This publication is intended to establish prior art for the above methods and to prevent any third party from obtaining patent protection covering these techniques.


8. Implementation

A working implementation of the methods described in this disclosure is available as CHAMP ContainerGuard Enterprise, developed by Champtron Systems LLC.

The implementation is maintained under version control with timestamped commit history establishing the dates of conception and reduction to practice for each method described herein.


Copyright and Notice

© 2026 Champtron Systems LLC. All rights reserved.

NVIDIA Inception Member.

This document is published as a defensive publication to establish prior art. All methods described herein are the intellectual property of Champtron Systems LLC.
