Overview
Path traversal (or directory traversal) is a vulnerability that allows an attacker to read or write arbitrary files on the server that is running an application.
For example, consider an application that loads images via some HTML like the following:
Copy <img src="/load?filename=218.png">
The /load
URL takes the filename
parameter and returns the contents of the specified file. The image files themselves are stored on disk in the location /var/www/images/
. To return an image, the application:
appends the requested filename 218.png
to the base directory /var/www/images/
,
uses a filesystem API to read the contents of the /var/www/images/218.png
file.
This behaviour can be abused by an attacker. An attacker can request the following URL to retrieve an arbitrary file from the server:
Copy https://vulnerable-website.local/load?filename=../../../etc/passwd
This causes the application to read from the following file path:
Copy /var/www/images/../../../etc/passwd
As a result, the application will return to an attacker a content of the /etc/passwd
.
You can find more details at .
This page contains recommendations for the implementation of protection against path traversal attacks.
General
Do not allow a user to control data that is passed to the file system API.
Use an allow list of paths to validate user-provided data if possible.
Include only alphanumeric characters in an allow list of characters.
If it is necessary to use special characters such as .
or /
, prevent the use of combinations represented as regular expressions below.
Copy \A\.\z
\A\.\.\z
\A\.\.[/\\]
[/\\]\.\.\z
[/\\]\.\.[/\\]
\A/
\A~
\n
\r
Do not rely solely on a block list validation, it can be bypassed in many cases.
Methods used to construct file paths can have non-intuitive behaviour. Make sure the methods you use work as expected with data like this:
Copy .
..
/
~/some/path
/etc/passwd
../../../etc/passwd
/../../../etc/passwd
../../../../../../../../../../../../etc/passwd
ClarificationFor example, the Pathname.join
method in Ruby, which joins pathnames, handles absolute names unintuitively.
Copy require 'pathname'
p = Pathname.new('tmp')
user_controlled_input = 'etc/passwd'
print(p.join('log', user_controlled_input, 'foo'))
# => tmp/log/etc/passwd/foo
user_controlled_input = '/etc/passwd'
print(p.join('log', user_controlled_input, ''))
# => /etc/passwd
As can be seen, if user_controller_input
contains an absolute path, Pathname.join
will ignore everything up to the argument with the absolute path. In other words, it will allow an attacker to craft an arbitrary path.
Use a sandbox to obtain or save data.
Implementing the canonical path validation
Go Java Python
Copy package main
import (
"fmt"
"path"
"strings"
)
func inTrustedBasePath(basePath string, userControlledPath string) bool {
basePath = path.Join(basePath)
fullPath := path.Join(basePath, userControlledPath)
if !strings.HasPrefix(fullPath, basePath+"/") {
return false
}
return true
}
func main() {
basePath := "/tmp/path/to/base/folder"
userControlledData := "nothing/dangerous.txt"
fmt.Println(inTrustedBasePath(basePath, userControlledData))
// => true
userControlledData = "../../../path/traversal"
fmt.Println(inTrustedBasePath(basePath, userControlledData))
// => false
}
Copy import java.io.*;
public class CanonicalPathValidation {
public static void main(String []args){
String basePath = "/tmp/path/to/base/folder";
String userControlledData = "nothing/dangerous.txt";
System.out.println(inTrustedBasePath(basePath, userControlledData));
// => true
String basePath = "/tmp/path/to/base/folder";
String userControlledData = "../../../path/traversal";
System.out.println(inTrustedBasePath(basePath, userControlledData));
// => false
}
private static boolean inTrustedBasePath(String basePath, String userControlledPath) {
try {
File file = new File(basePath, userControlledPath);
if (!file.getCanonicalPath().startsWith(basePath)) {
return false;
}
return true;
} catch (IOException e) {
return false;
}
}
}
Copy import os
from pathlib import Path
def in_trusted_base_path(base_path: str, user_controlled_path: str) -> bool:
base_path = Path(base_path)
full_path = Path(base_path, user_controlled_path)
real_path = os.path.realpath(full_path)
return Path(real_path).is_relative_to(base_path)
def main() -> None:
base_path = '/tmp/path/to/base/folder'
user_controlled_data = 'nothing/dangerous.txt'
print(in_trusted_base_path(base_path, user_controlled_data))
# => True
base_path = '/tmp/path/to/base/folder'
user_controlled_data = '../../../path/traversal'
print(in_trusted_base_path(base_path, user_controlled_data))
# => False
References