Sudip Sengupta

Posted on Sep 26, 2022

What is Directory Traversal in Cyber Security?

In a standard web server, the root folder (web root) is the publicly accessible folder that is served when a user types a website’s domain name in the address bar. The root directory contains the site’s index file and paths to the other files and directories the site exposes. Files outside the root, such as configuration, credential, and operating system files, are meant to be accessible only to server-side code or the web application itself.

An adversary can read these files and directories in vulnerable applications to access information such as password files, application data, source code, and OS files. Such attacks typically leverage directory traversal vulnerabilities and are referred to as directory/path traversal or directory climbing attacks.

This article discusses the types and various prevention measures for directory traversal attacks.

What is a Path Traversal Vulnerability?

A directory traversal vulnerability, also known as a file path traversal vulnerability, allows attackers to read arbitrary files on the web application server. It is considered one of the most critical application security risks in modern development frameworks because it allows a threat actor without authorization to reach restricted files, and in some cases execute malicious code on the server, by altering file path names in the request.

Any server software that fails to validate input from browsers is susceptible to directory traversal vulnerabilities. Once directories outside the root are reachable, attackers rely on guesswork or exposed directory trees to determine the name and location of sensitive server files. In some cases, attackers can also read and write arbitrary files used by the server software, allowing them to perform malicious actions such as:

Accessing sensitive files such as credentials and source code files
Manipulation of application data and functionality through arbitrary code execution
Impersonation of privileged users such as administrators to gain their access rights

In the absence of appropriate input validation, attacks based on path traversal vulnerabilities are typically easy to orchestrate, as they require only basic knowledge of HTTP requests and server file handling.

Directory Traversal Attack Examples

Path traversal vulnerabilities can arise in server configuration or server-side application code and occur across programming languages. Examples include:

Directory Traversal in Python

Developers rely on the Django framework to build secure and maintainable Python web applications. While the framework offers several developer-friendly features, Django applications can still contain vulnerabilities that allow attackers to perform directory traversal attacks. An unauthenticated attacker can perform file traversal on a Django web app by inserting ../ (dot-dot-slash) sequences, also known as server escape codes or path traversal sequences, into the URL filename parameter.

Consider an application with an <img> element that looks similar to:

<img src="/getImage?filename=darwin_sample_image.jpg" />

The getImage URL request accepts the user-supplied filename parameter and returns the specified file object. Assuming these images are stored in a directory such as /darwin/menu/images, the application appends the file name to the directory path, producing /darwin/menu/images/darwin_sample_image.jpg, which the application’s API uses to access the file. The attacker may modify the URL to access restricted files if the application accepts malformed inputs. For example, to gain access to the password file, the attacker can alter the URL input values to:

https://darwin-vulnerable-site/getImage?filename=../../../etc/passwd

Since the base directory is prepended to the requested file name, the application treats the input as part of a valid directory path and builds the target path /darwin/menu/images/../../../etc/passwd. Each ../ sequence steps up one directory level, so the path resolves to /etc/passwd, granting the attacker access to the password file outside the image directory. While Django checks for illegal characters in requests by default, skilled hackers can use a combination of alphanumeric characters and escape codes to bypass these security checks.
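The way the joined path collapses can be sketched in a few lines of Python. This is a minimal illustration, not the application's or Django's actual code: BASE_DIR and naive_image_path are hypothetical names, and posixpath is used so the result is the same on any operating system.

```python
import posixpath

# Hypothetical image directory, taken from the example above.
BASE_DIR = "/darwin/menu/images"


def naive_image_path(filename: str) -> str:
    """Join user input onto the base directory with no validation (vulnerable)."""
    joined = posixpath.join(BASE_DIR, filename)
    # normpath collapses each "../" the same way the filesystem would resolve it.
    return posixpath.normpath(joined)


if __name__ == "__main__":
    print(naive_image_path("darwin_sample_image.jpg"))
    print(naive_image_path("../../../etc/passwd"))
```

The first call stays inside the image directory, while the second resolves to /etc/passwd because the three ../ sequences cancel the three directory levels of the base path.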

Path Traversal in Java

In Java-based applications, threat actors can exploit weak access control implementations on Java servlets to perform path traversal attacks. For example, the HTML form that uploads a file to a vulnerable Java servlet would look similar to:

<form action="FileUploadServlet" method="post" 
enctype="multipart/form-data">
Choose a file to upload:
<input type="file" name="darwin-filename"/>
<br/>
<input type="submit" name="submit" value="Submit"/>
</form>

Once the form is submitted, the servlet receives the request and uses its doPost method to extract the filename string from the multipart request. Next, the servlet reads the uploaded contents and writes them to a file in the application’s upload directory, as shown in the code snippet below:

public class FileUploadServlet extends HttpServlet {
    ...

    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        String contentType = request.getContentType();

        int ind = contentType.indexOf("boundary=");
        String boundary = contentType.substring(ind+9);

        String pLine = new String();
        String uploadLocation = new String(UPLOAD_DIRECTORY_STRING); //Constant value

        if (contentType != null && contentType.indexOf("multipart/form-data") != -1) {
            // extract the filename from the Http header
            BufferedReader br = new BufferedReader(new InputStreamReader(request.getInputStream()));
            ...
            pLine = br.readLine();
            String filename = pLine.substring(pLine.lastIndexOf("\\"), pLine.lastIndexOf("\""));
            ...

            try {
                BufferedWriter bw = new BufferedWriter(new FileWriter(uploadLocation+filename, true));
                for (String line; (line=br.readLine())!=null; ) {
                    if (line.indexOf(boundary) == -1) {
                        bw.write(line);
                        bw.newLine();
                        bw.flush();
                    }
                } //end of for loop
                bw.close();

            } catch (IOException ex) {...}
            // output successful upload response HTML page
        }
        // output unsuccessful upload response HTML page
        else
        {...}
    }
    ...
}

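This code validates neither the uploaded file type nor the user-supplied filename before concatenating it with the upload location. A threat actor can therefore upload arbitrary files containing malicious code, or include ../ sequences in the filename to write outside the upload directory, compromising the application and its underlying data.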
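One common mitigation for the filename problem is to discard any directory portion of the client-supplied name before building the destination path. The Python sketch below illustrates the idea under stated assumptions: UPLOAD_DIR and safe_upload_path are hypothetical names, not part of the servlet above, and the check covers only the filename, not the file type or contents.

```python
import posixpath

# Hypothetical upload directory, used only for illustration.
UPLOAD_DIR = "/srv/app/uploads"


def safe_upload_path(client_filename: str, upload_dir: str = UPLOAD_DIR) -> str:
    """Return a destination path that always sits directly inside upload_dir."""
    # Treat backslashes as separators too, since some clients send Windows-style paths.
    candidate = client_filename.replace("\\", "/")
    # Keep only the final path component, dropping any directory part.
    name = posixpath.basename(candidate)
    if name in ("", ".", "..") or "\x00" in name:
        raise ValueError("invalid upload filename")
    return posixpath.join(upload_dir, name)


if __name__ == "__main__":
    print(safe_upload_path("../../etc/cron.d/job"))
```

Because only the last path component survives, a name such as ../../etc/cron.d/job is stored as job inside the upload directory rather than being written elsewhere on the filesystem.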

Directory Traversal in PHP

Consider a website that uses a PHP script to serve static content from a target folder:

// VULNERABLE: the page parameter is used in the path without validation
$page = $_GET['page'];
$filename = "/pages/$page";
$file_handler = fopen($filename, "r");
$contents = fread($file_handler, filesize($filename));
fclose($file_handler);
echo $contents;

The URL for the About page, whose markup is stored in /pages/about, would look similar to:

view.php?page=about

A malicious actor can craft a malformed request to access the Admin page or other sensitive contents by modifying the URL as follows:

view.php?page=../admin/login.php

Without proper user input validation, the attacker can read the contents of the login script, which may expose admin credentials and lead to other valuable information.

Path Traversal Vulnerability in Apache HTTP Server

The open-source Apache HTTP Server has shipped with vulnerabilities that allow attackers to orchestrate path traversal attacks. These include CVE-2021-41773, introduced by a change to path normalization in version 2.4.49, and CVE-2021-42013, which resulted from an incomplete fix in version 2.4.50. The changed normalization code did not sufficiently validate URL-encoded (percent-encoded) characters, allowing attackers to bypass path traversal detection with encoded dot segments. Attackers combine this failed security check with a directive misconfigured to grant access to the entire web server filesystem, as shown:

<Directory />
Require all granted
</Directory>

Using the above technique, an attacker can craft a malformed input to access restricted content and files using common paths.
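Why an encoded payload slips past a check for literal ../ can be sketched in Python. This is an illustration of the general failure mode only, not Apache's normalization code: the function names are hypothetical, and the sample paths are assumed forms of encoded dot segments of the kind reported for these CVEs.

```python
from urllib.parse import unquote


def naive_has_traversal(raw_path: str) -> bool:
    """Flawed check: looks for a literal '../' before any decoding happens."""
    return "../" in raw_path


def fully_decode(raw_path: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    current = raw_path
    for _ in range(max_rounds):
        decoded = unquote(current)
        if decoded == current:
            return current
        current = decoded
    raise ValueError("too many layers of percent-encoding")


def has_traversal(raw_path: str) -> bool:
    """Decode first, then look for a '..' path segment."""
    segments = fully_decode(raw_path).replace("\\", "/").split("/")
    return ".." in segments


if __name__ == "__main__":
    for path in ("/cgi-bin/.%2e/.%2e/etc/passwd", "/cgi-bin/%%32%65%%32%65/etc/passwd"):
        print(path, naive_has_traversal(path), has_traversal(path))
```

The naive check misses both sample paths because neither contains a literal ../, while decoding before inspection exposes the .. segments, including the double-encoded form that needs two decoding rounds.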

Approaches to Prevent Directory Traversal Attacks

Access Control Lists (ACLs) and user input validation form the first line of defense against directory traversal attacks. Dynamic Application Security Testing (DAST) and Static Application Security Testing (SAST) techniques also help check applications and their code for directory traversal vulnerabilities. In addition, most programming languages and web development frameworks include guidelines on securing critical system files and hardening the environment.

The following sections discuss directory traversal fixes for common languages and development frameworks.

Path Traversal Vulnerability Fix In Java

Developers can prevent path traversal by allowlisting (safe-listing) the acceptable inputs and paths that should be accessible to the public. Typical code for allowlisting paths in Java would be similar to:

if (VALID_PATHS.contains(normalized_path)) {
    File file = new File(BASE_PATH, normalized_path);
    // Resolve ".." segments and symlinks, then confirm the file is still under BASE_PATH
    if (file.getCanonicalFile().toPath().startsWith(Paths.get(BASE_PATH))) {
        String content = new String(Files.readAllBytes(file.toPath()));
        return content;
    } else {
        return "Access Error";
    }
} else {
    return "Access Error";
}

Path Traversal Vulnerability Fix in Python

Security mechanisms to prevent path traversal in Python applications include:

Using the latest web frameworks and software versions that have path traversal prevention mechanisms to help detect alternative character representations
Using the os.path.relpath function to express the resolved requested path relative to the base directory, and rejecting any result that begins with ..
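A containment check along these lines can be sketched with the standard library. This is a minimal sketch, not a complete defense: resolve_inside is a hypothetical helper, os.path.realpath is an added step (assumed here so that symlinks and ../ segments are resolved before the comparison), and the base directory in the usage example is made up.

```python
import os


def resolve_inside(base_dir: str, requested: str) -> str:
    """Resolve requested under base_dir and refuse anything that escapes it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    target = os.path.realpath(os.path.join(base, requested))
    # Note: on Windows, relpath raises ValueError for paths on different drives.
    rel = os.path.relpath(target, base)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise PermissionError(f"path escapes base directory: {requested!r}")
    return target


if __name__ == "__main__":
    # Hypothetical base directory, used only for illustration.
    print(resolve_inside("/srv/app/images", "menu/pizza.jpg"))
    try:
        resolve_inside("/srv/app/images", "../../../etc/passwd")
    except PermissionError as err:
        print("rejected:", err)
```

Comparing the resolved path against the resolved base, rather than inspecting the raw input for ../, means encoded or otherwise disguised input is judged by where it actually lands.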

Path Traversal Vulnerability Fix in C#

Path traversal vulnerability fixes for C# web applications include:

Using the Path.GetInvalidFileNameChars() method to identify invalid characters in the specified request parameter
Using indirect object references to map resource location and avoid supplying user input to filesystem APIs
Combining the absolute file path check with sanitization of user-supplied path names and file extension validation
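The indirect object reference idea is language-agnostic, so it is sketched here in Python rather than C#. The mapping, file paths, and function name are hypothetical; the point is that the request supplies only a key, and the filesystem path comes from server-side data.

```python
import posixpath

# Hypothetical mapping from public keys to server-side file locations.
RESOURCE_MAP = {
    "about": "/pages/about.html",
    "contact": "/pages/contact.html",
}

# Hypothetical allowlist of extensions the application is willing to serve.
ALLOWED_EXTENSIONS = {".html"}


def lookup_resource(key: str) -> str:
    """Translate a user-supplied key into a path without using the key as a path."""
    try:
        path = RESOURCE_MAP[key]
    except KeyError:
        raise LookupError(f"unknown resource: {key!r}") from None
    # Extension validation as a second, independent check on the mapped file.
    if posixpath.splitext(path)[1] not in ALLOWED_EXTENSIONS:
        raise LookupError(f"resource type not allowed: {key!r}")
    return path


if __name__ == "__main__":
    print(lookup_resource("about"))
```

A request for ../admin/login.php fails the dictionary lookup and never reaches a filesystem API, so no traversal sequence needs to be detected or sanitized.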

FAQs

What is the difference between file inclusion attacks and directory traversal?

Directory traversal attacks only let the attacker access restricted and sensitive files. On the other hand, in file inclusion attacks, the hacker loads malicious code within the file and executes it in the application’s context.

What are arbitrary files?

In cybersecurity, an arbitrary file is any file of the attacker’s choosing rather than one the application intended to expose or accept. When a vulnerability lets a threat actor write or upload arbitrary files, those files can be malicious and may modify operating system functions, server files, application source code, and other system settings. Such files are built outside the scope of the application and are then supplied as input to filesystem APIs, giving attackers access to functionality that should be restricted.

This article has already been published on https://crashtest-security.com/path-traversal-vulnerability/ and has been authorized by Crashtest Security for a republish.
