Pickle Rick-Roll: Critical RCE in vLLM's Mooncake Integration
Vulnerability ID: CVE-2025-32444
CVSS Score: 10.0
Published: 2025-04-30
A critical Remote Code Execution (RCE) vulnerability exists in vLLM's Mooncake distributed KV cache transfer mechanism. The flaw stems from the use of insecure Python pickling over unauthenticated ZeroMQ sockets bound to all network interfaces.
TL;DR
vLLM's Mooncake integration used Python's pickle serialization over exposed ZeroMQ sockets. Any attacker who can reach the port can send a malicious message and execute arbitrary code on the GPU host with the privileges of the vLLM process (root, if the service runs as root). CVSS 10.0.
⚠️ Exploit Status: POC
Technical Details
- CWE: CWE-502 (Deserialization of Untrusted Data)
- CVSS v3.1: 10.0 (Critical)
- Attack Vector: Network (Remote)
- Library: pyzmq (recv_pyobj)
- Impact: Remote Code Execution (Root/User)
- Protocol: ZeroMQ (TCP)
Affected Systems
- vLLM Inference Engine
- Mooncake KV Transfer System
- vLLM: >= 0.6.5, < 0.8.5 (Fixed in: 0.8.5)
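As a rough illustration of the affected range, the sketch below checks whether a version string falls inside it. The helper names parse_version and is_affected are hypothetical, and the parser only reads the leading major.minor.patch numbers, so pre-release or local suffixes are ignored. That simplification is an assumption and is not how pip compares versions.

```python
import re

AFFECTED_MIN = (0, 6, 5)  # first affected release (inclusive)
FIXED_IN = (0, 8, 5)      # first fixed release (exclusive upper bound)


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_affected(version: str) -> bool:
    """Return True if the version lies in the range >= 0.6.5, < 0.8.5."""
    return AFFECTED_MIN <= parse_version(version) < FIXED_IN
```

The installed version string could be obtained with importlib.metadata.version("vllm") and passed to is_affected.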
Code Analysis
Commit: a5450f1
Fix CVE-2025-32444 by replacing pickle with struct pack/unpack
- src_ptr, length = self.receiver_socket.recv_pyobj()
+ data = self.receiver_socket.recv_multipart()
+ src_ptr = struct.unpack("!Q", data[0])[0]
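The idea behind the fix can be sketched outside of vLLM as follows. This is not the actual patch: the function names and the two-frame layout (pointer first, length second) are assumptions made for illustration, since the diff excerpt only shows how src_ptr is read. The point is that a fixed binary format can only ever yield integers, whereas pickle can yield arbitrary objects.

```python
import struct

# "!Q" = network byte order, unsigned 64-bit integer (same format as the diff).
_U64 = struct.Struct("!Q")


def encode_transfer(src_ptr: int, length: int) -> list[bytes]:
    """Pack the two integers into two fixed-size 8-byte frames."""
    return [_U64.pack(src_ptr), _U64.pack(length)]


def decode_transfer(frames: list[bytes]) -> tuple[int, int]:
    """Validate frame count and sizes, then unpack two unsigned integers."""
    if len(frames) != 2:
        raise ValueError(f"expected 2 frames, got {len(frames)}")
    for frame in frames:
        if len(frame) != _U64.size:
            raise ValueError(f"expected {_U64.size}-byte frame, got {len(frame)}")
    src_ptr = _U64.unpack(frames[0])[0]
    length = _U64.unpack(frames[1])[0]
    return src_ptr, length
```

With pyzmq, frames like these would be sent with send_multipart and read with recv_multipart, as in the patched line above. A malformed message then fails validation instead of being interpreted as code.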
Exploit Details
- Theoretical: Standard Python pickle RCE that uses the __reduce__ method to execute shell commands.
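The mechanism can be shown with a deliberately harmless example. In this sketch the class name Payload is hypothetical and the callable is int, chosen only because it has no side effects. The loader calls whatever callable the serialized data names, which is why unpickling data from an untrusted peer amounts to letting that peer choose a function to run.

```python
import pickle


class Payload:
    """Stand-in for attacker-controlled data; the callable here is harmless."""

    def __reduce__(self):
        # pickle records this (callable, args) pair in the byte stream,
        # and the loader calls callable(*args) while deserializing.
        return (int, ("1337",))


blob = pickle.dumps(Payload())

# No Payload object comes back: the loader ran int("1337") instead.
result = pickle.loads(blob)
```

Here result is the integer 1337 rather than a Payload instance. A socket call such as recv_pyobj, which unpickles whatever bytes arrive, runs this step on data supplied by the sender.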
Mitigation Strategies
- Disable Mooncake integration if not actively required.
- Isolate vLLM instances behind strict network firewalls (VPC/Security Groups).
- Run vLLM as a non-root user to limit the impact of RCE.
Remediation Steps:
- Stop the running vLLM service.
- Pull the latest Docker image or update the pip package: pip install -U "vllm>=0.8.5" (quote the specifier so the shell does not treat > as a redirect).
- Verify that the configuration binds KV_TRANSFER sockets to internal IPs only.
- Restart the service.
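The bind-address check in the steps above could be automated along these lines. This is a minimal sketch using only the standard library: the function name is_internal_bind_host is hypothetical, it does not read any vLLM configuration, and it treats hostnames as unsafe because it does not resolve them.

```python
import ipaddress


def is_internal_bind_host(host: str) -> bool:
    """Return True only for loopback or private IP literals."""
    if host in ("", "*"):
        # "*" is the ZeroMQ wildcard that binds every interface.
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example a hostname); not resolved here.
        return False
    if addr.is_unspecified:
        # 0.0.0.0 and :: also mean "all interfaces".
        return False
    return addr.is_loopback or addr.is_private
```

A private address still needs firewall rules around it. The check only catches the wildcard and public-address cases.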
References
Read the full report for CVE-2025-32444 on our website for more details including interactive diagrams and full exploit analysis.