Etairos.ai

Posted on • Originally published at threat-intelligence.redeyesecurity.com

Google Vertex AI SDK Flaw Let Attackers Hijack Model Uploads With Nothing but a Project ID

TL;DR

  • what: Google's Vertex AI SDK for Python generated a predictable staging bucket name and checked only that it existed, not who owned it, letting an attacker pre-create the bucket and intercept a victim's model upload.
  • impact: A swapped pickle/joblib model executed attacker code inside Vertex AI's serving container, stealing an OAuth token that reached other tenants' model artifacts, BigQuery metadata, tenant logs, GKE names, and internal image paths.
  • fix: Google fully fixed it in google-cloud-aiplatform v1.148.0 (April 15, 2026) by adding bucket ownership verification; an interim uuid4 fix shipped in v1.144.0, and no CVE has been assigned.
  • who: Anyone uploading models via the Vertex AI Python SDK without setting an explicit staging_bucket, especially new projects in a region where the default bucket does not yet exist.

A flaw in the Google Cloud Vertex AI SDK for Python let an attacker with no access to a victim's project hijack that victim's machine learning model upload and execute code inside Google's serving infrastructure. The only inputs required were a Google Cloud project the attacker already controlled and the victim's project ID, which is frequently public. No credentials, no phishing, no foothold.

Palo Alto Networks Unit 42 found and reported the bug, naming the technique "Pickle in the Middle." Google has patched it. If you use the SDK, update google-cloud-aiplatform to version 1.148.0 or later and set an explicit staging bucket. Unit 42 saw no exploitation in the wild, but the attack is cheap, fast, and requires almost nothing from the target.

The bug: a name check, not an ownership check

When a developer uploaded a model without specifying a staging bucket, the SDK derived a predictable Cloud Storage bucket name from the project ID and region, in the form <project-id>-vertex-staging-<region>. It then checked whether that bucket existed but never verified that the victim's project actually owned it. Because Cloud Storage bucket names share a single global namespace, an attacker who knew the project ID could create the expected bucket first, in their own project. The victim's SDK would then upload the model files straight into the attacker's bucket.
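The difference between an existence check and an ownership check can be sketched in a few lines. This is a simplified model, not the SDK's actual code: the dictionary standing in for Cloud Storage's global namespace, the function names, and the project IDs are all hypothetical. Only the bucket name pattern comes from the description above.

```python
from typing import Dict


def default_staging_bucket_name(project_id: str, region: str) -> str:
    """Predictable default name: anyone who knows the project ID can compute it."""
    return f"{project_id}-vertex-staging-{region}"


class BucketOwnershipError(Exception):
    """Raised when the expected bucket exists but belongs to another project."""


def choose_bucket_existence_only(
    project_id: str, region: str, namespace: Dict[str, str]
) -> str:
    """Flawed logic: create the bucket if missing, otherwise use it as-is.

    `namespace` maps bucket name -> owning project (a hypothetical stand-in
    for the global Cloud Storage namespace).
    """
    name = default_staging_bucket_name(project_id, region)
    if name not in namespace:
        namespace[name] = project_id
    return name  # returned even if another project owns the bucket


def choose_bucket_with_ownership_check(
    project_id: str, region: str, namespace: Dict[str, str]
) -> str:
    """Same lookup, but refuse a bucket that the caller's project does not own."""
    name = default_staging_bucket_name(project_id, region)
    owner = namespace.setdefault(name, project_id)
    if owner != project_id:
        raise BucketOwnershipError(f"{name} is owned by {owner}, not {project_id}")
    return name
```

If an attacker registers the name first, the existence-only version hands back a bucket the victim does not control, while the ownership-checked version fails before any upload happens.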

From there the attacker replaced the uploaded model with a malicious one. Most Python ML models are serialized with pickle or joblib, formats that execute arbitrary code at load time. When Vertex AI later loaded the swapped model, the attacker's code ran inside the serving container.
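To see why a swapped pickle file amounts to code execution, consider this harmless demonstration. The class name is hypothetical, and os.getcwd stands in for whatever callable an attacker would actually choose. joblib builds on pickle, so the same property applies to joblib files.

```python
import os
import pickle


class LooksLikeAModel:
    """Hypothetical stand-in for a serialized model object."""

    def __reduce__(self):
        # pickle records this (callable, args) pair in the byte stream and
        # calls it while loading. The file's author picks the callable.
        return (os.getcwd, ())


blob = pickle.dumps(LooksLikeAModel())

# Loading the bytes calls os.getcwd(); no method on the "model" is ever invoked.
loaded = pickle.loads(blob)
```

The value that comes back is the result of the call, not a LooksLikeAModel instance. Whoever controls the bytes controls what runs in the process that loads them.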

A 2.5-second race the attacker wins

The attack hinged on speed. Unit 42 measured roughly 2.5 seconds between the victim's upload completing and Vertex AI reading the file. In their proof of concept, a Cloud Function triggered on upload and replaced the model in 1.4 seconds, comfortably inside the window. The payload then pulled an OAuth token from the serving container's metadata server and exfiltrated it to the attacker.

⚠️ The stolen token was not scoped to the deployment: In Unit 42's test environment, the OAuth token reached well beyond the compromised model. It could access other model artifacts in the same Google-managed tenant project, including a full TensorFlow model with trained weights, plus BigQuery metadata, access lists, tenant logs, GKE cluster names, and internal container image paths. A single hijacked upload became a cross-tenant data exposure.

When the attack works

The technique required two conditions on the victim's side to line up. First, the victim's default staging bucket must not yet exist in that region, which is common for a new Vertex AI project. Second, the developer must have left the staging_bucket parameter unset and relied on the SDK default. Both are ordinary defaults rather than misconfigurations, which is what makes the issue dangerous: the vulnerable path is the convenient one. On the attacker's side, the bar is only a known project ID and a GCP project to host the squatted bucket. In full:

  • No victim staging bucket yet exists in the target region (typical for new projects)
  • The developer did not set staging_bucket and used the SDK default
  • The attacker knows the victim's project ID (often public)
  • The attacker has any GCP project of their own to host the squatted bucket
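Read together, the conditions form a single conjunction: break any one of them and the squatting path closes. The sketch below is illustrative only. The UploadContext type and its field names are invented for this example and do not correspond to anything in the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadContext:
    """Hypothetical description of one model upload, for illustration only."""

    default_bucket_exists: bool     # victim already owns the default bucket in this region
    staging_bucket_set: bool        # developer passed an explicit staging_bucket
    project_id_known: bool          # attacker knows the victim's project ID
    attacker_has_gcp_project: bool  # attacker can create buckets somewhere


def squatting_preconditions_met(ctx: UploadContext) -> bool:
    """True only when every condition from the list holds at once."""
    return (
        not ctx.default_bucket_exists
        and not ctx.staging_bucket_set
        and ctx.project_id_known
        and ctx.attacker_has_gcp_project
    )


# A new project relying on SDK defaults, with a public project ID.
new_project = UploadContext(
    default_bucket_exists=False,
    staging_bucket_set=False,
    project_id_known=True,
    attacker_has_gcp_project=True,
)

# The same project after the developer sets staging_bucket explicitly.
hardened = UploadContext(
    default_bucket_exists=False,
    staging_bucket_set=True,
    project_id_known=True,
    attacker_has_gcp_project=True,
)
```

The only fields a defender controls are the first two, which is why setting an explicit staging bucket matters.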

Disclosure and the fix

Unit 42 reported the flaw through Google's Vulnerability Reward Program on March 5, 2026, testing versions 1.139.0 and 1.140.0 and finding both vulnerable. Google shipped an initial fix in v1.144.0 on March 31, appending a random uuid4 to the bucket name to break predictability. It completed the fix in v1.148.0 on April 15 by adding bucket ownership verification to block squatting in Model.upload(). As of publication, no CVE has been assigned; neither Unit 42 nor Google's Vertex AI security bulletins list one.

What to do now: Update to google-cloud-aiplatform 1.148.0 or later so the ownership check is active. Set an explicit staging_bucket pointing to a Cloud Storage location you control on every model upload. Because the flawed logic lives in the client SDK, audit the version everywhere it runs, including notebooks, CI jobs, and training pipelines, not just production services.
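A rough way to apply both recommendations in each environment is sketched below. The version thresholds are the ones reported in this article. The function names and status labels are hypothetical, and the version parser is deliberately simplistic (it reads only the first three numeric components). The final helper assumes google-cloud-aiplatform is installed and that the bucket you pass in already exists in your own project.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

INTERIM_FIX = (1, 144, 0)  # random uuid4 suffix added to the bucket name
FULL_FIX = (1, 148, 0)     # bucket ownership verification added


def parse_version(text: str) -> Tuple[int, int, int]:
    """Take the first three numeric components, padding with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", text)[:3]]
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def classify_sdk_version(text: str) -> str:
    """Label a version string as 'pre-fix', 'interim', or 'fixed'."""
    version = parse_version(text)
    if version >= FULL_FIX:
        return "fixed"
    if version >= INTERIM_FIX:
        return "interim"
    return "pre-fix"


def installed_sdk_status(dist_name: str = "google-cloud-aiplatform") -> Optional[str]:
    """Classify the SDK installed in this environment, or None if absent."""
    try:
        return classify_sdk_version(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


def init_with_explicit_bucket(project: str, location: str, bucket_uri: str) -> None:
    """Pin the staging bucket instead of relying on the SDK default.

    bucket_uri is a gs:// URI for a bucket you created and control.
    """
    from google.cloud import aiplatform  # imported lazily; requires the SDK

    aiplatform.init(project=project, location=location, staging_bucket=bucket_uri)
```

Running installed_sdk_status() in each notebook kernel, CI image, and pipeline container gives a quick inventory. Anything other than "fixed" is a candidate for an upgrade.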

A pattern, not a one-off

This is the second predictable-bucket-name flaw in Vertex AI this year. Google patched CVE-2026-2473 in February, a separate bucket-squatting bug in Vertex AI Experiments that also enabled cross-tenant code execution, model theft, and poisoning. Unit 42's earlier research on Vertex AI's default service-agent permissions traced a related path from a deployed AI agent into customer and tenant data. The recurring theme is clear: predictable resource names plus insecure deserialization plus over-scoped tenant tokens turn an ML pipeline into a cross-tenant attack surface. Treat model artifacts as executable code, control your storage locations explicitly, and stop trusting SDK defaults to be safe.


Originally published on RedEye Threat Intelligence.
