Crawl4AI: When Web Scrapers Become File Servers
Vulnerability ID: GHSA-VX9W-5CX4-9796
CVSS Score: 8.6
Published: 2026-01-16
Crawl4AI, a popular tool for making web content LLM-friendly, inadvertently exposed a massive hole in its Docker API. By failing to validate URL schemes, it allowed unauthenticated attackers to use the file:// protocol to read local files from the server, turning a useful scraper into a highly effective data exfiltration tool.
TL;DR
The Crawl4AI Docker API accepted any URL scheme, including file://. Attackers could use endpoints like /execute_js to read sensitive local files (such as /etc/passwd or files exposing environment variables) simply by asking the crawler to 'visit' them. This is a classic Local File Inclusion (LFI) vulnerability, and it is fixed in version 0.8.0.
⚠️ Exploit Status: POC
Technical Details
- Attack Vector: Network (API)
- CVSS: 8.6 (High)
- CWE: CWE-22 (Path Traversal)
- Privileges: None (Unauthenticated)
- Impact: High Confidentiality (File Read)
- Exploit Status: Functional PoC Available
Affected Systems
- Crawl4AI Docker API
- Crawl4AI Python Package
- crawl4ai: < 0.8.0 (Fixed in: 0.8.0)
Exploit Details
- Context Analysis: Exploitation involves sending a POST request to /execute_js in which the target URL uses the file:// scheme instead of http:// or https://.
Mitigation Strategies
- Input Validation: Enforce strict protocol allow-listing (HTTP/HTTPS only).
- Network Segmentation: Block access to internal metadata services and local networks.
- Container Hardening: Run containers with read-only filesystems and minimal environment variables.
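The protocol allow-listing described in the first mitigation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual patch: `ALLOWED_SCHEMES` and `is_allowed_url` are hypothetical names, and a real service would call a check like this before handing any user-supplied URL to the crawler.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: only plain web schemes are accepted.
ALLOWED_SCHEMES = frozenset({"http", "https"})


def is_allowed_url(url: str) -> bool:
    """Return True only for absolute http(s) URLs that name a host."""
    try:
        parts = urlsplit(url.strip())
        host = parts.hostname
    except ValueError:
        # Malformed input (for example a broken IPv6 literal) is rejected.
        return False
    # urlsplit lowercases the scheme, so "FILE://" is caught as well.
    return parts.scheme in ALLOWED_SCHEMES and bool(host)


if __name__ == "__main__":
    for candidate in ("https://example.com/page", "file:///etc/passwd", "/etc/passwd"):
        print(candidate, "->", is_allowed_url(candidate))
```

An allow-list is used rather than a deny-list of file:// because any scheme not explicitly listed is rejected by default. Note that this check alone does not cover the network segmentation point: an http:// URL can still target an internal address.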
Remediation Steps:
- Stop the running Crawl4AI container immediately.
- Pull the latest Docker image: docker pull unclecode/crawl4ai:0.8.0 (or newer).
- Restart the service with the new image.
- Verify the fix by replaying the PoC payload against the new instance and confirming that the file:// request is rejected.
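The verification step can be scripted. The sketch below is a rough aid for testing your own instance only, and it rests on assumptions that are not confirmed by this advisory: the JSON field names `url` and `scripts` are assumed, as is the idea that a leaked /etc/passwd would show a `root:x:0:0` entry in the response. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_probe_request(base_url: str, target: str = "file:///etc/passwd") -> urllib.request.Request:
    """Build (but do not send) a POST to /execute_js that asks for a local file."""
    # ASSUMPTION: the body field names "url" and "scripts" are illustrative.
    body = json.dumps({"url": target, "scripts": ["return document.title"]}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/execute_js",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def response_leaks_passwd(text: str) -> bool:
    """Heuristic: a typical /etc/passwd begins with a root entry like this."""
    return "root:x:0:0" in text


def still_vulnerable(base_url: str, timeout: float = 10.0) -> bool:
    """Send the probe and report whether file contents came back."""
    request = build_probe_request(base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            text = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # An error status is expected from a patched instance; inspect the body anyway.
        text = err.read().decode("utf-8", errors="replace")
    return response_leaks_passwd(text)
```

Calling `still_vulnerable("http://localhost:11235")` (the address is a placeholder for wherever your container listens) should return False on a patched instance. A False result only means the heuristic found no passwd content, so review the raw response as well.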
References
Read the full report for GHSA-VX9W-5CX4-9796 on our website for more details including interactive diagrams and full exploit analysis.