Jer Catallo

Posted on Jun 14

Path Traversal: What It Is, Why It's Dangerous, and How to Stop Attackers from Reading Files They Shouldn't

#cybersecurity #tutorial #todayilearned #security

Path traversal is a web vulnerability where an attacker reads files outside the directory your application intends to serve. It sounds simple, but the impact can be severe. One missing validation and an attacker can walk straight into your .env, your database backups, or your cloud credentials using nothing more than ../ in a URL.

The mock server used in this demonstration is available at https://github.com/jercatallo/lab-path-traversal if you want to follow along and try the vulnerability yourself.

Path traversal, also called directory traversal, happens when an application takes a filename from user input and builds a file path with it, without checking where that path actually leads. Attackers chain ../ sequences to climb up the directory tree and reach files the application was never meant to expose.

Common targets include:

Environment files with database credentials
Configuration files with cloud API keys
Database backups with user data and password hashes
Internal service credentials

Path Traversal vs LFI vs RFI

These three vulnerabilities are related but different. Here is a quick comparison:

Feature	Path Traversal	LFI	RFI
File source	Local filesystem	Local filesystem	Remote server
Execution	Read only	Can execute code	Can execute code
Main goal	Read sensitive files	Read or execute local files	Execute remote malicious code
Common payload	`../` sequences	`../` + file paths	`http://attacker.com/shell.php`

Path traversal only reads files. LFI goes further and can execute them. RFI pulls code from an external server entirely. This demonstration focuses on path traversal.

Vulnerable Code Examples

Path traversal happens when developers trust user input without validating the resulting file path. Here are common patterns that cause this vulnerability.

PHP - Direct file path concatenation without validation:

<?php
$filename = $_GET['file'];
$filepath = '/var/www/files/' . $filename;
readfile($filepath);
?>

The code takes the file parameter directly from the URL and appends it to the base path. If an attacker sends ?file=../.env, the server reads /var/www/files/../.env which resolves to /var/www/.env.

Python - Missing path boundary check:

from flask import Flask, request, send_file
import os

app = Flask(__name__)
BASE_DIR = "/var/www/files"

@app.route("/files")
def serve_file():
    filename = request.args.get("file")
    filepath = os.path.join(BASE_DIR, filename)
    return send_file(filepath)

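The os.path.join() function does not prevent directory traversal. When filename is ../.env, the result is /var/www/files/../.env which the server resolves to /var/www/.env. An absolute filename is worse: os.path.join() discards BASE_DIR entirely when a later component starts with /.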
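To see the join behavior without touching a real filesystem, here is a minimal sketch. It uses posixpath so the output is the same on any platform, and normpath to collapse ../ lexically, which matches what the OS does when no symlinks are involved. The joined helper is a hypothetical name used only for this illustration, and /etc/passwd is just an example of an absolute path.

```python
import posixpath

BASE_DIR = "/var/www/files"

def joined(filename):
    """Join like the vulnerable handler, then collapse ../ the way the OS would."""
    return posixpath.normpath(posixpath.join(BASE_DIR, filename))

print(joined("user-1.svg"))   # /var/www/files/user-1.svg
print(joined("../.env"))      # /var/www/.env
print(joined("/etc/passwd"))  # /etc/passwd (the absolute component discards BASE_DIR)
```

Neither result is an error from Python's point of view, which is why the boundary check has to be written explicitly.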

Node.js - Unsafe path construction:

const express = require('express');
const fs = require('fs');
const path = require('path');
const app = express();

app.get('/files', (req, res) => {
  const filename = req.query.file;
  const filepath = path.join(__dirname, 'files', filename);
  res.sendFile(filepath);
});

The path.join() method normalizes ../ sequences instead of rejecting them, so the joined path simply lands outside the files directory. An attacker sending ?file=../config/production.json gets the full production configuration file.

Java - File object without canonical path validation:

@WebServlet("/files")
public class FileServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String filename = req.getParameter("file");
        File file = new File("/var/www/files", filename);
        // Missing: check if file.getCanonicalPath() starts with base directory
        Files.copy(file.toPath(), resp.getOutputStream());
    }
}

The File constructor accepts any path string. Without calling getCanonicalPath() and verifying it stays within the allowed directory, traversal sequences pass through unchecked.

The common mistake across all these examples is the same: user input flows directly into file path construction without checking where the final path actually points.
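The shared mistake can be reproduced locally in a few lines. The sketch below builds a throwaway directory tree with a files/ folder and a fake .env next to it, then reads through a deliberately naive helper. The names read_naive and demo, and the file contents, are made up for this illustration and are not taken from the lab server.

```python
import os
import tempfile

def read_naive(base_dir, filename):
    """Vulnerable on purpose: joins user input to the base path with no boundary check."""
    with open(os.path.join(base_dir, filename), "rb") as f:
        return f.read()

def demo():
    # Build a throwaway tree: <root>/.env and <root>/files/user-1.svg
    with tempfile.TemporaryDirectory() as root:
        files_dir = os.path.join(root, "files")
        os.mkdir(files_dir)
        with open(os.path.join(root, ".env"), "wb") as f:
            f.write(b"DATABASE_PASSWORD=fake-value\n")
        with open(os.path.join(files_dir, "user-1.svg"), "wb") as f:
            f.write(b"<svg/>")

        intended = read_naive(files_dir, "user-1.svg")  # stays inside files/
        leaked = read_naive(files_dir, "../.env")       # climbs out of files/
        return intended, leaked

if __name__ == "__main__":
    print(demo())
```

The second read succeeds because nothing in read_naive ever asks where the joined path ends up.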

Ethical Considerations

All credentials, tokens, and data shown here are simulated values created for this lab. The environment runs on a local machine with no real user data. Never test path traversal on systems you do not own or have written permission to test. Unauthorized file access is illegal in most jurisdictions.

Step 1: Finding Available Endpoints

Before attacking, we need to discover what endpoints exist on the target application. In real-world scenarios, attackers use fuzzing, directory brute-forcing, or source code review to find these. This lab provides an /api endpoint that lists available routes for training purposes.

curl http://localhost:8080/api | jq

The response reveals two endpoints: /api itself (which lists endpoints) and /files which serves files using a file query parameter. The /files endpoint is our target for path traversal. In a real engagement, you would not have this convenient listing and would need to discover it through reconnaissance techniques.

Step 2: Normal File Access

The application runs on localhost:8080 with a /files endpoint that serves user profile icons. We start by requesting a legitimate file to confirm normal behavior.

curl "http://localhost:8080/files?file=user-1.svg"

The server returns the full SVG markup for the profile icon. The endpoint works as expected when given a valid filename within the intended directory. No errors, no restrictions visible from the outside.

Step 3: Escaping the Directory with `../`

Now we test if the server validates the file parameter. We use ../ to move one level up from the files directory and target an admin credentials file.

curl "http://localhost:8080/files?file=../admin/credentials.txt"

The server returns the credentials file without any error. You can see the fake admin email and password, three API tokens with different permission levels (read-only, read-write, admin), and internal service URLs for the payment and user services. The application accepted the ../admin/ path without any validation. This confirms path traversal is possible.
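The same request can also be sent with the slash percent-encoded. The sketch below assumes the server decodes the query string before the handler runs, as most web frameworks do; build_url and decoded_param are hypothetical helper names, and nothing here contacts the lab server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://localhost:8080/files"  # lab endpoint from the steps above

def build_url(filename):
    """Return the /files URL with the filename percent-encoded in the query string."""
    return BASE_URL + "?" + urlencode({"file": filename})

def decoded_param(url):
    """Decode the query string the way a typical framework would before the handler runs."""
    return parse_qs(urlsplit(url).query)["file"][0]

url = build_url("../admin/credentials.txt")
print(url)                 # http://localhost:8080/files?file=..%2Fadmin%2Fcredentials.txt
print(decoded_param(url))  # ../admin/credentials.txt
```

Because the handler sees the decoded value either way, a filter that only looks for a literal ../ in the raw URL would likely miss the encoded form.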

Step 4: Reading the Environment File

With directory traversal confirmed, we go after the .env file. This file typically holds database connection details that the application loads at startup.

curl "http://localhost:8080/files?file=../.env"

The response exposes five environment variables: DATABASE_HOST, DATABASE_PORT (5432, the default PostgreSQL port), DATABASE_USER, DATABASE_PASSWORD, and DATABASE_NAME. In a real application, these details are everything an attacker needs to log in to your database, provided the database server is reachable from their network. One request, and the database is open.

Step 5: Pulling a Database Backup

Backup files are another high-value target. They are often stored on the same server and contain a full snapshot of your data.

curl "http://localhost:8080/files?file=../backups/db_dump.sql"

The dump includes the full schema and data for two tables. The users table has usernames, bcrypt password hashes, emails, and roles. The payments table has card numbers, CVV codes, and expiry dates. An attacker reading this file can try to crack weak password hashes offline and reuse any recovered passwords on other services (credential stuffing), and also holds raw payment data. PCI DSS prohibits storing CVV codes after authorization, so a dump like this is a compliance violation even before it leaks.

Step 6: Exposing Production Configuration

The last target is the production configuration file. These files often bundle multiple secrets together in one place.

curl "http://localhost:8080/files?file=../config/production.json"

The JSON file reveals three sections: database connection details with a separate internal hostname (fake-db-host-name.internal), AWS credentials with an access key ID and secret access key, and a JWT signing secret with a 7-day expiry. With the AWS keys, an attacker can enumerate and access cloud resources. With the JWT secret, they can forge authentication tokens and impersonate any user.
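To make the token-forging risk concrete, here is a minimal sketch of why a leaked signing secret is enough. It assumes the application signs tokens with HS256 (HMAC-SHA256), which the lab output does not confirm; the secret value and the claim names sub and role are placeholders invented for this example.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a JWT-shaped token signed with HMAC-SHA256."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = b64url(json.dumps(claims, separators=(",", ":")).encode())
    signing_input = header + "." + payload
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

def verify_hs256(token: str, secret: bytes) -> bool:
    """Recompute the signature the way a server would and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(signature, b64url(expected))

leaked_secret = b"fake-lab-secret"  # placeholder, not the value from the lab
token = sign_hs256({"sub": "admin", "role": "admin"}, leaked_secret)
print(verify_hs256(token, leaked_secret))    # True
print(verify_hs256(token, b"wrong-secret"))  # False
```

The server cannot tell this token from one it issued itself, because the signature is the only proof of origin and anyone holding the secret can produce it.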

Step 7: Remediation

Path traversal is preventable. The root cause is the application trusting user input to build file paths. Here are the key fixes:

Resolve the canonical path: Use realpath() or equivalent to resolve the full path before doing anything with it.
Validate against an allowlist: Check that the resolved path starts with the intended base directory. Reject anything outside it.
Avoid raw path construction: Use file IDs or database references instead of accepting filenames directly from users.
Apply least privilege: Run the application with a user that only has read access to the files directory. Even if traversal succeeds, the process cannot read files it has no permission to access.
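The third fix in the list above, file IDs instead of filenames, can be sketched as follows. The ICONS table, the path_for_icon name, and the user-2.svg entry are hypothetical; in a real application the lookup would more likely be a database query.

```python
import os

FILES_DIR = "/var/www/files"

# Hypothetical ID-to-filename table; in practice this could be a database lookup.
ICONS = {
    "1": "user-1.svg",
    "2": "user-2.svg",
}

def path_for_icon(icon_id):
    """Map an opaque ID to a server-chosen filename; user input never reaches the path."""
    filename = ICONS.get(icon_id)
    if filename is None:
        raise KeyError("unknown icon id")
    return os.path.join(FILES_DIR, filename)
```

A request for ../.env fails at the dictionary lookup, so there is no path to validate in the first place.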

Secure path validation example in Python:

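import os

# Resolve the base once so a symlinked base directory still compares correctly.
ALLOWED_DIR = os.path.realpath("/var/www/files")

def serve_file(filename):
    # Resolve ../ sequences and symlinks before checking the boundary.
    full_path = os.path.realpath(os.path.join(ALLOWED_DIR, filename))
    # The trailing separator keeps sibling directories that share the prefix from matching.
    if not full_path.startswith(ALLOWED_DIR + os.sep):
        raise ValueError("Access denied: path traversal detected")
    # Close the file handle promptly instead of leaving it to the garbage collector.
    with open(full_path, "rb") as f:
        return f.read()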

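The key line is the startswith check after realpath(). Because realpath() resolves ../ sequences and symbolic links first, the comparison runs against the path that will actually be opened, so chaining traversal sequences cannot bypass it. Percent-encoded payloads such as %2e%2e%2f are typically decoded by the web framework before they reach this function, so they resolve the same way and are rejected too.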
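A quick way to gain confidence in a check like this is to run it against a handful of payloads. The sketch below is a standalone variant of the same idea that takes the base directory as a parameter so it can run in a temporary directory; resolve_inside, is_blocked, demo, and the files_backup sibling directory are names invented for this test.

```python
import os
import tempfile

def resolve_inside(base_dir, filename):
    """Return the real path for filename, or raise ValueError if it leaves base_dir."""
    base = os.path.realpath(base_dir)
    full_path = os.path.realpath(os.path.join(base, filename))
    if not full_path.startswith(base + os.sep):
        raise ValueError("Access denied: path traversal detected")
    return full_path

def is_blocked(base_dir, filename):
    """True when resolve_inside rejects the filename."""
    try:
        resolve_inside(base_dir, filename)
    except ValueError:
        return True
    return False

def demo():
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, "files")
        os.mkdir(base)
        os.mkdir(os.path.join(root, "files_backup"))  # sibling sharing the "files" prefix
        return {
            "plain": is_blocked(base, "user-1.svg"),
            "parent": is_blocked(base, "../.env"),
            "chained": is_blocked(base, "a/../../.env"),
            "sibling": is_blocked(base, "../files_backup/db_dump.sql"),
            "absolute": is_blocked(base, os.path.join(root, ".env")),
        }

if __name__ == "__main__":
    print(demo())
```

The sibling case is the reason for appending os.sep: without it, a path under files_backup would still start with the string for files and pass the check.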

Summary

Path traversal is a low-effort, high-impact vulnerability. Each attack above (Steps 3 through 6) used a single curl command with ../ in the file parameter. No special tools, no authentication bypass, just a missing validation check on the server side.

The attack chain here went from reading a profile icon to exposing admin credentials, database environment variables, a full database backup with payment data, and AWS cloud credentials. Each of those steps is one request. Protect your applications by resolving canonical paths, validating against a strict base directory, and running with least privilege.
