Path Traversal

Overview

Path traversal (or directory traversal) is a vulnerability that allows an attacker to read or write arbitrary files on the server that is running an application.

For example, consider an application that loads images via some HTML like the following:

<img src="/load?filename=218.png">

The /load URL takes the filename parameter and returns the contents of the specified file. The image files themselves are stored on disk in the location /var/www/images/. To return an image, the application:

appends the requested filename 218.png to the base directory /var/www/images/,
uses a filesystem API to read the contents of the /var/www/images/218.png file.

An attacker can abuse this behaviour by requesting the following URL to retrieve an arbitrary file from the server:

https://vulnerable-website.local/load?filename=../../../etc/passwd

This causes the application to read from the following file path:

/var/www/images/../../../etc/passwd

As a result, the application returns the contents of /etc/passwd to the attacker.
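The resolution step can be reproduced in a few lines. The following Python sketch is illustrative only: it uses the standard posixpath module to mimic what a naive /load handler might do, and the build_path helper is a hypothetical name, not part of any application described here.

```python
import posixpath

BASE_DIR = "/var/www/images/"


def build_path(filename: str) -> str:
    """Naively join the base directory and a user-supplied filename (vulnerable)."""
    # normpath collapses ".." segments lexically; without symbolic links this is
    # the same file the operating system would open.
    return posixpath.normpath(posixpath.join(BASE_DIR, filename))


print(build_path("218.png"))
# => /var/www/images/218.png
print(build_path("../../../etc/passwd"))
# => /etc/passwd
```

The three .. segments cancel out images, www and var, so the request escapes the base directory entirely.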

You can find more details at PortSwigger Web Security Academy: Directory traversal.

This page contains recommendations for protecting an application against path traversal attacks.

General

Do not allow a user to control data that is passed to the file system API.
Generate random folder and file names using a UUID or a random generator instead of relying on user-provided names; see the Cryptography: Universal Unique Identifier (UUID) and Cryptography: Random Generators pages.
If a user can control the data, implement comprehensive input validation for all data that is passed to the file system API; see the Input Validation page.
- Use an allow list of paths to validate user-provided data if possible.
- Include only alphanumeric characters in an allow list of characters.
- If special characters such as . or / are necessary, reject any input that matches one of the regular expressions below.
  \A\.\z \A\.\.\z \A\.\.[/\\] [/\\]\.\.\z [/\\]\.\.[/\\] \A/ \A~ \n \r
- Do not rely solely on block list validation; it can be bypassed in many cases.
Make sure the canonicalized path starts with the expected base directory; see the Implementing the canonical path validation section.
Methods used to construct file paths can have non-intuitive behaviour. Make sure the methods you use work as expected with data like this:
```
.
..
/
~/some/path
/etc/passwd
../../../etc/passwd
/../../../etc/passwd
../../../../../../../../../../../../etc/passwd
```
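As a rough illustration, the regular expressions listed above can be combined into a single check and run against this test data. This is a minimal Python sketch, not a complete defence: the is_rejected helper is a hypothetical name, and the \z end-of-string anchor is written as \Z because that is the spelling Python's re module accepts.

```python
import re

# The patterns from the list above, with \z written as \Z for Python.
FORBIDDEN_PATTERNS = [
    r"\A\.\Z",          # exactly "."
    r"\A\.\.\Z",        # exactly ".."
    r"\A\.\.[/\\]",     # starts with ".." followed by a separator
    r"[/\\]\.\.\Z",     # ends with a separator followed by ".."
    r"[/\\]\.\.[/\\]",  # ".." between two separators
    r"\A/",             # absolute path
    r"\A~",             # home directory shortcut
    r"\n",              # line feed
    r"\r",              # carriage return
]
FORBIDDEN = re.compile("|".join(FORBIDDEN_PATTERNS))

PAYLOADS = [
    ".",
    "..",
    "/",
    "~/some/path",
    "/etc/passwd",
    "../../../etc/passwd",
    "/../../../etc/passwd",
    "../../../../../../../../../../../../etc/passwd",
]


def is_rejected(value: str) -> bool:
    """Return True when the value matches any forbidden pattern."""
    return FORBIDDEN.search(value) is not None


for payload in PAYLOADS:
    print(repr(payload), is_rejected(payload))
```

Every payload in the list should be reported as rejected, while a value such as nothing/dangerous.txt passes. Because this is still a block list, it is best combined with the canonical path check rather than used alone.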

Clarification

For example, the Pathname.join method in Ruby, which joins pathnames, handles absolute names unintuitively.

require 'pathname'

p = Pathname.new('tmp')

user_controlled_input = 'etc/passwd'
print(p.join('log', user_controlled_input, 'foo'))
# => tmp/log/etc/passwd/foo

user_controlled_input = '/etc/passwd'
print(p.join('log', user_controlled_input, ''))
# => /etc/passwd

As can be seen, if user_controlled_input contains an absolute path, Pathname.join ignores every argument that precedes the one with the absolute path. In other words, it allows an attacker to craft an arbitrary path.
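The same pitfall exists outside Ruby. As a comparison, the sketch below shows that Python's standard posixpath.join and pathlib.PurePosixPath also discard earlier components when a later one is absolute; the directory names are the same illustrative ones used in the Ruby example.

```python
import posixpath
from pathlib import PurePosixPath

user_controlled_input = "etc/passwd"
print(posixpath.join("tmp", "log", user_controlled_input))
# => tmp/log/etc/passwd

user_controlled_input = "/etc/passwd"
# An absolute component resets the result, so "tmp" and "log" are dropped.
print(posixpath.join("tmp", "log", user_controlled_input))
# => /etc/passwd
print(PurePosixPath("tmp", "log", user_controlled_input))
# => /etc/passwd
```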

Use a sandbox to obtain or save data.
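For the recommendation in this section to generate random folder and file names, one possible shape is sketched below in Python. The UPLOAD_DIR location, the .bin suffix and the new_storage_path helper are assumptions made for illustration; the point is only that the name on disk never derives from user input.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical storage location, chosen for this example only.
UPLOAD_DIR = PurePosixPath("/var/www/uploads")


def new_storage_path() -> PurePosixPath:
    """Return a server-generated path; no user input reaches the file system API."""
    # uuid4 produces a random UUID; .hex renders it as 32 hexadecimal characters.
    return UPLOAD_DIR / f"{uuid.uuid4().hex}.bin"


print(new_storage_path())
```

If the original file name needs to be shown to users later, it can be kept as metadata (for example in a database row keyed by the generated name) instead of being used on disk.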

Implementing the canonical path validation

In Go, use the path.Join function to build a cleaned path. It removes .. segments lexically and does not resolve symbolic links.

package main

import (
    "fmt"
    "path"
    "strings"
)

func inTrustedBasePath(basePath string, userControlledPath string) bool {
    basePath = path.Join(basePath)
    fullPath := path.Join(basePath, userControlledPath)
    if !strings.HasPrefix(fullPath, basePath+"/") {
        return false
    }
    return true
}

func main() {
    basePath := "/tmp/path/to/base/folder"
    userControlledData := "nothing/dangerous.txt"
    fmt.Println(inTrustedBasePath(basePath, userControlledData))
    // => true

    userControlledData = "../../../path/traversal"
    fmt.Println(inTrustedBasePath(basePath, userControlledData))
    // => false
}

In Java, use the File.getCanonicalPath method to get a canonical path.

import java.io.File;
import java.io.IOException;

public class CanonicalPathValidation {

    public static void main(String[] args) {
        String basePath = "/tmp/path/to/base/folder";
        String userControlledData = "nothing/dangerous.txt";
        System.out.println(inTrustedBasePath(basePath, userControlledData));
        // => true

        // Reuse the variables; declaring them a second time does not compile.
        userControlledData = "../../../path/traversal";
        System.out.println(inTrustedBasePath(basePath, userControlledData));
        // => false
    }

    private static boolean inTrustedBasePath(String basePath, String userControlledPath) {
        try {
            // Canonicalize both paths so that symbolic links and ".." segments are resolved.
            String canonicalBase = new File(basePath).getCanonicalPath();
            String canonicalPath = new File(canonicalBase, userControlledPath).getCanonicalPath();
            // The trailing separator keeps sibling directories such as ".../folder2" from matching.
            return canonicalPath.startsWith(canonicalBase + File.separator);
        } catch (IOException e) {
            return false;
        }
    }
}

In Python, use the os.path.realpath function to get a canonical path.

import os
from pathlib import Path


def in_trusted_base_path(base_path: str, user_controlled_path: str) -> bool:
    # Resolve the base directory too, so symbolic links in it do not break the comparison.
    real_base = Path(os.path.realpath(base_path))
    real_path = Path(os.path.realpath(real_base / user_controlled_path))
    # Path.is_relative_to requires Python 3.9 or later.
    return real_path.is_relative_to(real_base)


def main() -> None:
    base_path = '/tmp/path/to/base/folder'
    user_controlled_data = 'nothing/dangerous.txt'
    print(in_trusted_base_path(base_path, user_controlled_data))
    # => True

    user_controlled_data = '../../../path/traversal'
    print(in_trusted_base_path(base_path, user_controlled_data))
    # => False


if __name__ == '__main__':
    main()
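To show where such a check might sit in a request handler like /load, here is a hedged Python sketch that validates the path and then reads the file through the already canonicalized path. The read_trusted_file helper is a hypothetical name, and the sketch does not address race conditions between the check and the read.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def read_trusted_file(base_path: str, user_controlled_path: str) -> Optional[bytes]:
    """Return the file contents, or None when the path escapes base_path."""
    real_base = Path(os.path.realpath(base_path))
    real_path = Path(os.path.realpath(real_base / user_controlled_path))
    # The canonical path must lie strictly inside the canonical base directory.
    if real_base not in real_path.parents:
        return None
    if not real_path.is_file():
        return None
    # Open the canonical path that was checked, not the raw user input.
    return real_path.read_bytes()


with tempfile.TemporaryDirectory() as base:
    Path(base, "218.png").write_bytes(b"image-bytes")
    print(read_trusted_file(base, "218.png"))
    # => b'image-bytes'
    print(read_trusted_file(base, "../../../etc/passwd"))
    # => None
```

Reading through real_path rather than rebuilding the path from user input keeps the validated path and the opened path identical.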

References

GitLab Docs: Secure coding development guidelines

