Application Security Handbook

File Upload


Last updated 1 year ago

Overview

This page contains recommendations for the implementation of secure file upload functionality.

General

  • Process files with well-known and up-to-date libraries and frameworks.

  • Implement comprehensive validation of each uploaded file, which must include at least file name sanitization, file extension validation, and Content-Type validation.

File name sanitization

    • Enforce a limit on file name length. For example, require file names to be between 1 and 64 characters long.

    • Define an allow list of characters for file names. The allow list must consist of alphanumeric characters, hyphens, spaces, and periods.

Example

The regex \A[A-Za-z0-9 .-]{1,64}\z can be used to validate that a file name contains only alphanumeric, hyphen, space, and period characters and is between 1 and 64 characters long.
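As a minimal sketch of this check in Python: the function name `is_valid_file_name` is an assumption made for illustration, and the character class follows the allow list stated in this section (alphanumerics, hyphens, spaces, periods). `re.fullmatch` is used in place of the `\A ... \z` anchors, because `\z` is not accepted by every Python version.

```python
import re

# Allow list from this section: alphanumerics, hyphens, spaces, and periods,
# with a total length between 1 and 64 characters.
FILE_NAME_RE = re.compile(r"[A-Za-z0-9 .-]{1,64}")


def is_valid_file_name(name: str) -> bool:
    """Return True only if the whole name matches the allow list."""
    # fullmatch requires the entire string to match, so a trailing newline
    # or NUL byte cannot slip past the check.
    return FILE_NAME_RE.fullmatch(name) is not None
```

A name such as `report 2024.pdf` passes, while `../etc/passwd` or a name containing a NUL byte is rejected because `/` and control characters are outside the allow list.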

File extension validation

    • Define an allow list of extensions. Do not use block list validation.

Clarification

Block list validation may miss unknown bad values that an attacker could use to bypass the validation. Typical bypass techniques include mixing lowercase and uppercase characters, adding a valid extension before or after an invalid one, adding special characters within or at the end of the extension, or combining these techniques, as in the following file names:

file.pHp
file.png.php
file.php.png
file.php%20
file.php%00
file.php%00.png
file.png.jpg.php
file.PhP%00.pNg%00.jpG
file.%E2%80%AEphp.jpg
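The following Python sketch shows one possible allow list check that rejects all of the names above. The allow list contents and the function name `has_allowed_extension` are hypothetical, and requiring exactly one period is a deliberately strict design choice rather than a rule stated on this page.

```python
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "gif"}  # hypothetical allow list


def has_allowed_extension(file_name: str) -> bool:
    """Accept only names of the form <base>.<ext> with an allow-listed ext."""
    # Require exactly one period so double extensions are rejected outright.
    if file_name.count(".") != 1:
        return False
    base, ext = file_name.split(".")
    # Compare case-insensitively; anything not in the allow list, including
    # extensions with trailing or encoded characters, is rejected.
    return bool(base) and ext.lower() in ALLOWED_EXTENSIONS
```

Because the check only accepts known-good values, variants such as `file.pHp`, `file.php%20`, and `file.php%00.png` fail without having to be listed explicitly.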

Content-Type validation

  • Define an allow list of MIME types and validate the Content-Type HTTP header against this list. Do not use a block list validation.

Note

The browser sends the following Content-Type header for each uploaded file in form-data requests:

POST /photo/upload HTTP/1.1
Content-Length: 1337
Content-Type: multipart/form-data; boundary=---------------------------974767299852498929531610575

-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="description"

some text
-----------------------------974767299852498929531610575
Content-Disposition: form-data; name="myFile"; filename="foo.txt"
Content-Type: text/plain

(content of the uploaded file foo.txt)
-----------------------------974767299852498929531610575--

Languages and frameworks often expose this per-file Content-Type header as the MIME type of the uploaded file.
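A minimal Python sketch of validating that per-file header against an allow list follows. The allow list contents and the function name `is_allowed_content_type` are assumptions for illustration; the header value is expected to be whatever the framework reports for the uploaded part.

```python
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "application/pdf"}  # hypothetical


def is_allowed_content_type(header_value: str) -> bool:
    """Check the media type of a Content-Type header against the allow list."""
    # Drop parameters such as "; charset=utf-8" and normalize the case,
    # since media types are case-insensitive.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_MIME_TYPES
```

This check alone only verifies what the client claims; the header is user-controlled, so it is one layer of validation and not proof of the actual file content.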

MIME Sniffing implementation

Existing libraries can be used to implement MIME sniffing.
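To illustrate what MIME sniffing does, the sketch below compares the leading bytes of a file with a few well-known signatures. It is a simplified stand-in, not a replacement for a maintained sniffing library: the signature table is deliberately incomplete, and the function names are hypothetical.

```python
from typing import Optional

# A few well-known file signatures (magic numbers); deliberately incomplete.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff_mime_type(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None if unknown."""
    for signature, mime_type in SIGNATURES:
        if data.startswith(signature):
            return mime_type
    return None


def matches_declared_type(data: bytes, declared: str) -> bool:
    """True only if the content is recognized and agrees with the declared type."""
    sniffed = sniff_mime_type(data)
    declared_type = declared.split(";", 1)[0].strip().lower()
    return sniffed is not None and sniffed == declared_type
```

A PHP script uploaded with `Content-Type: image/png` would fail `matches_declared_type`, because its content does not start with the PNG signature.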

File storage

Clarification

Using a UUID as the name of each uploaded file is the best way to avoid vulnerabilities associated with arbitrary, user-supplied file names.

  • Store uploaded files outside of the webroot folder.

  • Store uploaded files separate from application content and code. For example, use a separate volume or a third-party storage.

  • Define a directory structure appropriate for your file system.

Clarification

Most file systems have theoretical and practical limits on the number of files and directories that can be stored within a single directory or file system. Moreover, performance degrades when a single directory contains a huge number of files.

Therefore, it is important to be aware of file system limits and create a directory structure that mitigates these issues.

There is no standard for creating such a directory structure, but a good approach that distributes files across directories is the following:

  1. Take the first two characters from the UUID file name.

  2. If it does not exist, create a directory with those two characters as its name.

  3. Store the file within the directory named with those two characters from its UUID.

The following represents the directory structure result of using this approach:

FileStorage/
    18/
        1859b296-8bc3-4612-8179-8bec1dcaac16
    1f/
        1f663aa5-cef7-4f6a-85b2-48294de0cbcc
    97/
        973eb004-44ec-4900-8109-240076b649ba
    e9/
        e9bd6269-ef70-49d8-a73d-5781c30e89a9
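The approach above can be sketched in Python as follows. The function names `storage_path_for` and `store_upload` and the choice of a random (version 4) UUID are assumptions for illustration.

```python
import uuid
from pathlib import Path


def storage_path_for(root: Path, file_id: uuid.UUID) -> Path:
    """Map a UUID to <root>/<first two characters>/<uuid>."""
    name = str(file_id)
    return root / name[:2] / name


def store_upload(root: Path, data: bytes) -> Path:
    """Store data under a freshly generated UUID name and return its path."""
    target = storage_path_for(root, uuid.uuid4())
    # Create the two-character directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Mode "xb" fails if the file already exists, so nothing is overwritten.
    with open(target, "xb") as fh:
        fh.write(data)
    return target
```

With two hexadecimal characters as the directory name, files are spread over at most 256 directories; a deeper hierarchy can be built the same way if a single level is not enough for the expected number of files.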
  • Set up file storage and uploaded files with the minimum necessary set of permissions according to the purpose of use.

Example
  • Application users and groups have read and write permissions to store files.

  • Administrative accounts (backup or monitoring) have read-only permissions.

  • Any other users do not have access to the file storage.

  • If files should not be modified, set permissions to read-only.

  • If files can be modified by an application (on a user's update request), grant write permissions only to the application users and groups.

  • Reset permissions for all uploaded files.

  • Make sure the storage is encrypted at rest.

  • Make sure there is a backup strategy for the storage.

  • Make sure there is storage capacity monitoring.

Clarification

Monitor storage capacity so that the storage does not fill up and cause a denial of service.

  • If file execution is needed, the file must be scanned with an anti-malware solution and executed only within a separate and sandboxed environment.

  • Use third-party services to manage file-uploading functionality.

Clarification

There are third-party systems that can handle file uploading and provide enterprise-grade functionality and security.

  • Use a separate domain/service to store/host uploaded files.

Example

If the main domain is website.local, use a separate domain such as user-content.website-static.local to store uploaded files.

  • If files need to be modified by an application (for example, on a user's update request), set up versioning on files.

Authorization

  • Allow file uploads only for authenticated users.

  • Verify that the user has the necessary permissions to upload files.

  • Allow access to uploaded files only for registered users.

  • Do not expose uploaded files to anonymous users.

Size and rate limits

  • Set a limit on the size of the uploaded file and validate that the file size does not exceed the specified limit before handling it.

    • Use the Content-Length header to identify a file size.

    • Do not rely on user-controlled parameters such as form values to identify a file size.

Clarification

If no maximum file size is enforced, an attacker can submit huge files to fill the storage quota and cause a denial of service.

Do not rely on user-controlled parameters, such as GET, POST, or PUT parameters, to determine the file size: an attacker can set such a parameter to an arbitrary value to bypass size validation and upload files larger than allowed.
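A minimal Python sketch of this check follows. It assumes the framework exposes the raw `Content-Length` header value and a readable body stream; the limit, the exception class `UploadRejected`, and the function name `read_upload` are hypothetical. In a multipart request the `Content-Length` header covers the whole request body, so it is an upper bound for the size of the files it contains.

```python
import io
from typing import BinaryIO, Optional

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical 5 MiB limit


class UploadRejected(Exception):
    """Raised when an upload violates the size policy."""


def read_upload(stream: BinaryIO, content_length: Optional[str],
                limit: int = MAX_UPLOAD_BYTES) -> bytes:
    # Reject early when the declared size is missing, malformed, or too large.
    if (content_length is None or not content_length.isascii()
            or not content_length.isdigit()):
        raise UploadRejected("missing or invalid Content-Length")
    if int(content_length) > limit:
        raise UploadRejected("declared size exceeds the limit")
    # Read at most limit + 1 bytes so a body larger than declared is detected
    # without buffering an unbounded amount of data.
    data = stream.read(limit + 1)
    if len(data) > limit:
        raise UploadRejected("body exceeds the limit")
    return data


if __name__ == "__main__":
    print(len(read_upload(io.BytesIO(b"hello"), "5")))
```

The second check matters because the declared length is still client-supplied: capping the number of bytes actually read keeps the limit effective even when the header understates the body size.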

  • Limit the number of uploaded files per request.

  • Limit the number of file upload requests (single-file or multi-file) that a single user can perform within a fixed period of time, chosen according to application and user needs.

Clarification

The limit must be set according to an application's requirements and common usage.

For instance:

  • A photo album application could allow 100 photo file uploads within 2 minutes.

  • A ticket tracking system could allow 5 files per minute.
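Such a limit can be sketched as a fixed-window counter per user. The class name `FixedWindowLimiter`, the in-memory dictionary, and the injectable clock are assumptions for illustration; a deployment with several application instances would need shared storage for the counters.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most max_uploads files per user within each time window."""

    def __init__(self, max_uploads: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_uploads = max_uploads
        self.window_seconds = window_seconds
        self.clock = clock
        # user id -> (window start time, files counted in that window)
        self._state: Dict[str, Tuple[float, int]] = {}

    def allow(self, user_id: str, files: int = 1) -> bool:
        now = self.clock()
        start, count = self._state.get(user_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window has expired
        if count + files > self.max_uploads:
            self._state[user_id] = (start, count)
            return False
        self._state[user_id] = (start, count + files)
        return True
```

For the ticket tracking example, `FixedWindowLimiter(5, 60)` would accept five files from a user and reject further uploads until the 60-second window has passed.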

Content Disarm and Reconstruction (CDR)

Many dangerous file types can be used to attack an application and pose risks for the entity that receives and stores such files and users that access them. Content Disarm and Reconstruction (CDR) techniques allow you to mitigate these risks.

  • Use a third-party solution to perform CDR on known dangerous files before any further processing.

CDR Images

Image uploading may lead to security threats such as the following:

  • Embedding malicious code into metadata.

  • Cross-Site Scripting or XML external entity (XXE) injection through SVG images.

  • Memory leaks due to errors during image processing.

  • Remote code execution due to improper image processing.

  • Create an allow list of image formats and work only with images of these formats.

    • Avoid processing SVG files.

  • If it is necessary to work with SVG files, implement extra protection layers to reduce the blast radius.

  • Use well-known and up-to-date libraries to process images.

  • Process images (cropping, resizing, etc.) in a sandboxed environment.

Clarification

Most imaging libraries are complex and support dozens of file types, which often leads to high-severity vulnerabilities. Since many of these libraries are written in low-level languages, an attacker gets additional opportunities for remote code execution and access to arbitrary memory regions. Experience has shown that such libraries cannot be trusted. They should be isolated to reduce the blast radius in case of exploitation.

Process images in a sandboxed environment (such as an isolated container or a third-party system) to mitigate exploitation of vulnerabilities in used libraries.

  • Remove metadata from images such as EXIF metadata.

Clarification

Metadata may contain Personally Identifiable Information (PII), for example:

  • Date and time.

  • Geolocation.

  • Camera or phone model and settings.

  • etc.

That information will be available to all users, including attackers, who can access such images.

Moreover, metadata can be used by an attacker to deliver malicious payloads.

Example

To insert random noise into image content, you can randomly add or subtract 1 from each RGB byte. This adds subtle noise to the output image and makes the output unpredictable, which hinders an attacker who tries to leverage it for exploitation.
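A minimal Python sketch of this idea is shown below. It assumes the image has already been decoded into a flat buffer of raw RGB bytes by an imaging library, which is outside the scope of the sketch; the function name `add_noise` is hypothetical, and `secrets` is used so the noise comes from a cryptographically secure generator.

```python
import secrets


def add_noise(rgb: bytes) -> bytes:
    """Randomly add or subtract 1 from each byte, clamped to the range 0..255."""
    noisy = bytearray(len(rgb))
    for i, value in enumerate(rgb):
        delta = secrets.choice((-1, 1))
        # Clamp so that 0 never wraps to 255 and 255 never wraps to 0.
        noisy[i] = min(255, max(0, value + delta))
    return bytes(noisy)
```

After adding the noise, the image would be re-encoded from the modified buffer, so the stored file no longer matches the uploaded bytes.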

CDR Multimedia files

Multimedia file uploading may lead to security threats such as the following:

  • Memory leaks due to errors during multimedia file processing.

  • Remote code execution due to improper multimedia file processing.

  • Server-Side Request Forgery (SSRF) due to improper multimedia file processing.

  • Use well-known and up-to-date libraries to process audio and video files.

  • Process audio and video files in a sandboxed environment.

Clarification

Process audio and video files in a sandboxed environment (such as an isolated container or a third-party system) to mitigate exploitation of vulnerabilities in used libraries.

CDR Content and Markup Languages

XML and HTML file uploading may lead to security threats such as the following:

  • Arbitrary JavaScript execution via malicious HTML.

  • Arbitrary JavaScript execution via malicious CSS.

  • XML External Entity injection via malicious XML.

  • Avoid processing HTML files.

  • If it is necessary to work with HTML files, implement extra protection layers to reduce the blast radius.

CDR Compressed files

Compressed file uploading may lead to security threats such as the following:

  • Denial of service due to unpacking a zip bomb.

  • Overwriting existing files through path traversal while unpacking compressed files.

  • Accessing restricted files due to improper handling of symbolic links in compressed files.

  • Verify the decompressed output size before unpacking files.

  • Remove symbolic links from the unpacked files.

  • Reset permissions for all unpacked files.

  • Make sure unpacking does not overwrite existing files.

  • Validate each unpacked file according to the requirements on this page.
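The checks above can be sketched for ZIP archives with Python's standard `zipfile` module. The limits, the exception class `UnsafeArchive`, the function name `safe_extract`, and the permission mode are hypothetical; the sketch skips symbolic links and reads each entry into memory, which is acceptable only for small limits.

```python
import stat
import zipfile
from pathlib import Path

MAX_TOTAL_BYTES = 50 * 1024 * 1024  # hypothetical limit on unpacked size
MAX_MEMBERS = 1000                   # hypothetical limit on entry count


class UnsafeArchive(Exception):
    """Raised when an archive violates the unpacking policy."""


def safe_extract(archive, dest: Path) -> list:
    dest = dest.resolve()
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        members = zf.infolist()
        if len(members) > MAX_MEMBERS:
            raise UnsafeArchive("too many entries")
        # Declared sizes come from the archive itself and can be forged,
        # so the bytes actually read are capped again below.
        if sum(m.file_size for m in members) > MAX_TOTAL_BYTES:
            raise UnsafeArchive("declared unpacked size exceeds the limit")
        remaining = MAX_TOTAL_BYTES
        for m in members:
            if m.is_dir():
                continue
            # Skip symbolic links (the Unix mode is in the high 16 bits).
            if stat.S_ISLNK(m.external_attr >> 16):
                continue
            target = (dest / m.filename).resolve()
            if dest not in target.parents:
                raise UnsafeArchive("path traversal: " + m.filename)
            with zf.open(m) as src:
                data = src.read(remaining + 1)
            if len(data) > remaining:
                raise UnsafeArchive("unpacked size exceeds the limit")
            remaining -= len(data)
            target.parent.mkdir(parents=True, exist_ok=True)
            # Mode "xb" refuses to overwrite an existing file.
            with open(target, "xb") as out:
                out.write(data)
            target.chmod(0o640)  # reset permissions on the unpacked file
            extracted.append(target)
    return extracted
```

Each returned path would then go through the same file name, extension, and content validation as a directly uploaded file.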

CDR Documents

Uploading Microsoft Office and PDF documents may lead to security threats such as the following:

  • Execution of arbitrary JavaScript embedded in a PDF document.

  • Malicious files embedded in a PDF document.

  • Remote code execution due to improper PDF processing.

  • Malicious macros, Flash, OLE objects, or HTA handlers in Microsoft Office documents (Word, Excel, or PowerPoint).

Dangerous file types

Processing some types of files is potentially dangerous because it can easily lead to a vulnerability. It is recommended to avoid processing such files and to block the upload of any files with the corresponding extensions. If it is necessary to accept any of these files, it is highly recommended to handle them in a sandbox, implement extra protection layers (such as Content Security Policy), and perform Content Disarm and Reconstruction (CDR) to reduce the blast radius.

Below you can find a list of potentially dangerous files. This list is not exhaustive.

Extensions

.config, .ini, .htaccess, .htpasswd, .xml

Extensions

.zip, .rar, .tar, .tgz, .tar.gz

MIME Types

  • application/gzip

  • application/zip

  • application/vnd.rar

  • application/x-tar

Extensions

.bat, .cmd, .com, .cpl, .csh, .dll, .exe, .hta, .inf, .jar, .js, .msi, .msc, .msp, .mpkg, .ps1, .ps1xml, .ps2, .ps2xml, .psc1, .psc2, .sh, .swf, .vb, .vbe, .vbs, .ws, .wsf, .wsc, .wsh

MIME Types

  • application/vnd.microsoft.portableexecutable

  • application/java-archive

  • application/x-csh

  • application/vnd.apple.installer+xml

  • application/x-sh

  • application/x-shellscript

  • application/x-msdownload

  • application/x-shockwave-flash

  • text/javascript

Extensions

.svg

MIME Types

image/svg+xml

Extensions

.aac, .avi, .mp3, .mp4, .mpeg, .oga, .ogv, .wav

MIME Types

  • audio/aac

  • audio/mpeg

  • audio/ogg

  • video/x-msvideo

  • video/mp4

  • video/mpeg

  • video/ogg

  • audio/wav

Extensions

  • ASP .asp, .aspx, .config, .ashx, .asmx, .aspq, .axd, .cshtm, .cshtml, .rem, .soap, .vbhtm, .vbhtml, .asa, .cer, .shtml

  • Coldfusion .cfm, .cfml, .cfc, .dbm

  • Erlang Yaws Web Server .yaws

  • Flash .swf

  • JSP .jsp, .jspx, .jsw, .jsv, .jspf, .wss, .do, .action

  • Perl .pl, .cgi

  • PHP .php, .php2, .php3, .php4, .php5, .php6, .php7, .phps, .pht, .phtm, .phtml, .pgif, .shtml, .htaccess, .phar, .inc

  • Python .py

  • XML .xml

MIME Types

  • application/x-httpd-php

  • application/xml

  • text/xml

  • text/x-python

  • application/x-python-code

Related requirements by section

General

  • Follow file storage best practices, see the File storage section.

  • Set file size limits and implement upload rate limits, see the Size and rate limits section.

  • Implement protection against path traversal attacks, see the Vulnerability Mitigation: Path Traversal page.

  • Log successful and unsuccessful file upload attempts and file access events, see the Logging and Monitoring page.

  • Implement Content Disarm and Reconstruction (CDR) for potentially dangerous file types, see the Content Disarm and Reconstruction (CDR) section.

  • Comply with the requirements from the Error and Exception Handling page.

File name sanitization

  • Implement comprehensive input validation for file names, see the Input Validation page.

File extension validation

  • Implement comprehensive input validation for file extensions, see the Input Validation page.

  • Normalize input data before any comparison, validation, or processing. If normalization is not done properly, allow list validation can be bypassed easily; see the Normalization section in the Input Validation page.

  • Avoid uploading files with potentially dangerous extensions, see the Dangerous file types section.

Content-Type validation

  • Implement MIME Sniffing to extract the effective MIME type of a file's content.

  • Avoid uploading files with potentially dangerous MIME types, see the Dangerous file types section.

Clarification

The Content-Type header can be easily manipulated by a user and spoofed to bypass Content-Type verification. Therefore, it cannot be fully trusted. However, there is a content sniffing algorithm known as MIME Sniffing that provides more in-depth Content-Type validation and can be used to work around this issue.

MIME Sniffing implementation

  • libmagic library

  • DetectContentType() function in the Go net/http package

  • mmmagic library

  • python-magic, a wrapper for libmagic

File storage

  • Use a Universal Unique Identifier (UUID) as the name for uploaded files, see the Cryptography: Universal Unique Identifier (UUID) page.

  • Popular third-party services for handling file uploads include Filestack, Transloadit, Cloudinary, and Uploadcare.

  • Enforce a strict Content Security Policy on the service that hosts uploaded files, see the Content Security Policy (CSP) page.

  • Scan every uploaded file with an anti-malware solution such as VirusTotal.

Authorization

  • Comply with the requirements from the Authorization page.

  • Implement protection against Cross-Site Request Forgery (CSRF) attacks for file upload functionality, see the Vulnerability Mitigation: Cross-Site Request Forgery (CSRF) page.

Size and rate limits

  • Do not rely on chunked transfer encoding to upload files.

Clarification

Using chunked transfer encoding is not recommended because it gives no way to identify the file size in advance. Moreover, there is a class of vulnerabilities that abuses differences in how requests with Content-Length and Transfer-Encoding headers are parsed to smuggle HTTP requests.

Content Disarm and Reconstruction (CDR)

  • Implement CDR on images before any further processing, see the CDR Images section.

  • Implement CDR on multimedia files before any further processing, see the CDR Multimedia files section.

  • Implement CDR on XML and HTML files before any further processing, see the CDR Content and Markup Languages section.

  • Implement CDR on compressed files before any further processing, see the CDR Compressed files section.

  • Implement CDR on Microsoft Office documents and PDFs before any further processing, see the CDR Documents section.

CDR Images

  • Implement SVG sanitization, see DOMPurify.

  • Host images on a separate domain, see the File storage section.

  • Enforce a strict Content Security Policy on that domain, see the Content Security Policy (CSP) page.

  • Insert random noise into image content, see the Cryptography: Random Generators page.

CDR Multimedia files

  • The security threats for processing audio and video files are similar to those described in the CDR Images section.

CDR Content and Markup Languages

  • Implement HTML sanitization, see DOMPurify.

  • Use a separate domain to store/host uploaded HTML files, see the File storage section.

  • Enforce a strict Content Security Policy on that domain, see the Content Security Policy (CSP) page.

  • Comply with the requirements from the Vulnerability Mitigation: XML External Entity (XXE) page for XML files.

CDR Compressed files

  • Comply with the requirements from the Vulnerability Mitigation: Path Traversal page.

CDR Documents

  • Scan Microsoft Office and PDF documents with an anti-malware solution such as VirusTotal.

References

  • OWASP Cheat Sheet Series: File Upload Cheat Sheet