Pentesting, Network Security and System Administration: Path-traversal (a.k.a. directory traversal) attacks

Path Traversal

Web servers are generally set up to restrict public access to a specific portion of the server’s filesystem, typically called the “Web document root” directory. This directory contains files and any scripts that provide Web application functionality.
In a path-traversal attack, an intruder manipulates a URL in such a way that the Web server executes, or reveals the contents of, a file anywhere on the server — including outside the document root. Such attacks take advantage of special-character sequences in URL input parameters, cookies and HTTP request headers.
The most basic path traversal attack uses the ../ character sequence to alter the document or resource location requested in a URL. Although most Web servers prevent this method from escaping the Web document root, alternate encodings of the ../ sequence, such as Unicode-encoding, can bypass basic security filters. Even if a Web server properly restricts path-traversal attempts in the URL path, any application that exposes an HTTP-based interface is also potentially vulnerable to such attacks.

Note: On UNIX systems, the parent-directory sequence is ../ while on Windows it is ..\ (Windows also accepts ../).

Attack Scenario 1

The valid URL http://www.example.com/scripts/database.php?report=quarter1.txt is used to display a text file. However, manipulating it into a malicious URL like
http://www.example.com/scripts/database.php?report=../scripts/database.php%00txt will force the PHP application to display the source code of the database.php file, treating it as a text file whose contents are to be displayed.
The attacker uses the ../ sequence to traverse one directory above the “current” directory, and then re-enter the /scripts directory. The %00 sequence (a URL-encoded null byte) is used both to bypass a simple file-extension check and to cut off the extension when the file is read and processed by PHP. (PHP 5.3.4 and later reject filesystem paths that contain a null byte, so this trick only works against older versions.) This example highlights the critical importance of always checking and cleaning user-supplied input before allowing it to be processed.

Attack Scenario 2

The PHP code below accepts a username, and then opens a file specific to that username. It can be exploited by passing a username that causes it to refer to a different file:

$username = $_GET['user'];
$filename = "/home/users/$username";
readfile($filename);

Let’s suppose a normal URL is www.example.com/profile.php?user=arpit.pdf. If an attacker passes a changed query string to make a malicious URL like www.example.com/profile.php?user=../../etc/passwd then PHP will read /etc/passwd and output that to the attacker.
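To see why the second URL escapes /home/users, it can help to resolve the path by hand. The following is a minimal sketch that mimics, purely on strings, how the operating system collapses . and .. segments. The helper name normalize_path is a hypothetical one introduced for illustration and is not part of the article's script.

```php
<?php
// Hypothetical helper (an assumption for illustration): resolve "." and ".."
// segments purely as strings, without touching the filesystem.
function normalize_path(string $path): string
{
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;               // skip empty and "current directory" segments
        }
        if ($segment === '..') {
            array_pop($resolved);   // step up one level (no-op at the root)
        } else {
            $resolved[] = $segment;
        }
    }
    return '/' . implode('/', $resolved);
}

// The same concatenation the vulnerable script performs:
echo normalize_path('/home/users/' . 'arpit.pdf'), "\n";        // /home/users/arpit.pdf
echo normalize_path('/home/users/' . '../../etc/passwd'), "\n"; // /etc/passwd
```

Each .. cancels one directory of the fixed prefix, so two of them are enough to climb out of /home/users.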
Path-traversal attacks mostly target an application’s file upload, download, and display functionalities, like those often found in work-flow applications (where users can share documents); in blogging and auction applications (when users upload images); and in informational applications (when users retrieve documents like ebooks, technical manuals, and company reports).
In many cases, security measures such as filters for forward slashes or backslashes may be in place, but attackers can still try simple encoded representations of traversal sequences, such as those shown in the following table.

Table 1: Some encoding schemes
Character	URL encoding	16-bit Unicode encoding	Double URL encoding
dot	%2e	%u002e	%252e
forward slash	%2f	%u2215	%252f
backslash	%5c	%u2216	%255c
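A filter that looks only for a literal ../ misses the URL-encoded and double-URL-encoded forms in the table. One possible approach, sketched below, is to decode the value repeatedly until it stops changing and only then look for .. segments. The function names fully_decode and contains_traversal are hypothetical, and the non-standard %uXXXX encoding is not handled here because PHP's rawurldecode() does not decode it.

```php
<?php
// Hypothetical helper (an assumption for illustration): percent-decode
// repeatedly until the value is stable, so %252e -> %2e -> "." is unwrapped.
function fully_decode(string $value, int $maxRounds = 5): string
{
    for ($i = 0; $i < $maxRounds; $i++) {
        $decoded = rawurldecode($value);
        if ($decoded === $value) {
            break;                  // nothing left to decode
        }
        $value = $decoded;
    }
    return $value;
}

// Hypothetical check: true if any path segment is exactly "..".
function contains_traversal(string $value): bool
{
    // Treat backslashes as separators too, so ..\ is caught as well.
    $decoded = str_replace('\\', '/', fully_decode($value));
    return in_array('..', explode('/', $decoded), true);
}

var_dump(contains_traversal('%252e%252e%252fetc%252fpasswd')); // bool(true)
var_dump(contains_traversal('..%5c..%5cboot.ini'));            // bool(true)
var_dump(contains_traversal('quarter1.txt'));                  // bool(false)
```

A check like this is a detection aid, not a complete defence; it works best alongside canonicalising the final path and confirming that it stays inside the intended directory.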

The time for security

Path traversal attacks can be addressed with the following security measures:

There’s really no good reason for Apache to be allowed to serve files outside of its document root. Any request for files outside the document root is highly suspect, so we’ll restrict Apache to the document root’s directory structure with the following directives in the httpd.conf file:

<Directory />
    Order Deny,Allow
    Deny from all
    Options None
    AllowOverride None
</Directory>

<Directory www>
    Order Allow,Deny
    Allow from all
    Options -Indexes
</Directory>

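(Replace the www directory name with the full path of your Web server’s document root. The Options -Indexes line in the <Directory www> section disables directory browsing, which prevents directory-listing leakage rather than traversal itself; it is the Deny from all rule on / that blocks access to files outside the document root. On Apache 2.4 and later, Require all denied and Require all granted replace the Order, Deny and Allow directives.)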
Apart from this, ensure the user account of the Web server or Web application is given the least read permissions possible for files outside the Web document root. Also, change the default locations of your Web root directories.
After performing all relevant decoding and canonicalisation of user-submitted filenames, validate all inputs so that only an expected character set (such as alphanumeric) is accepted. The validation routine should be especially aware of path separators (/ and \), shell meta-characters, and command-concatenation characters (&& for Windows shells and the semicolon for UNIX shells).
Set a hard limit for the length of a user-supplied value. Note that this step should be applied to every parameter passed between the client and server, not just parameters the user is expected to modify via text boxes or similar input fields.
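The two validation steps above (an expected character set and a hard length limit) might look roughly like this in PHP. This is a minimal sketch: the function name is_valid_filename, the 64-character limit, and the exact pattern are illustrative assumptions, and a real application would choose them to match its own naming rules.

```php
<?php
// Illustrative limit (an assumption, not a recommendation from the article).
const MAX_NAME_LENGTH = 64;

// Hypothetical validator: accept only short, purely alphanumeric names with
// at most one dot-separated alphanumeric extension.
function is_valid_filename(string $name): bool
{
    if ($name === '' || strlen($name) > MAX_NAME_LENGTH) {
        return false;               // empty or over the hard length limit
    }
    // \A and \z anchor the whole string, so slashes, backslashes, null bytes,
    // ".." and shell meta-characters all fail to match.
    return preg_match('/\A[A-Za-z0-9]+(\.[A-Za-z0-9]+)?\z/', $name) === 1;
}

var_dump(is_valid_filename('arpit.pdf'));        // bool(true)
var_dump(is_valid_filename('../../etc/passwd')); // bool(false)
```

Accepting a narrow set of known-good characters is generally easier to reason about than trying to enumerate every dangerous sequence.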
The application should use a predefined list of permissible file types, and reject any request for a different type. Do this after the decoding and canonicalisation has been performed, so that an encoded or null-byte-padded name cannot slip past the check.
Any request containing path-traversal sequences should be logged as an attempted security breach, generating an alert to an administrator, terminating the user’s session, and if applicable, suspending the user’s account.
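As a rough illustration of the logging step, the sketch below detects a .. segment in an already-decoded parameter and writes a log line with PHP's error_log(). The function names, the log format, and the example address are hypothetical; alerting an administrator, ending the session, and suspending the account are left to the caller because they depend on the application.

```php
<?php
// Hypothetical helper (an assumption for illustration): build the log line.
function traversal_alert(string $param, string $value, string $remoteAddr): string
{
    // Encode the raw value so control characters cannot forge extra log lines.
    return sprintf(
        'SECURITY path-traversal attempt: param=%s value=%s client=%s',
        $param,
        rawurlencode($value),
        $remoteAddr
    );
}

// Hypothetical check: returns true (and logs) when the value contains "..".
function reject_if_traversal(string $param, string $value, string $remoteAddr): bool
{
    $normalized = str_replace('\\', '/', $value);
    if (!in_array('..', explode('/', $normalized), true)) {
        return false;               // nothing suspicious
    }
    error_log(traversal_alert($param, $value, $remoteAddr));
    return true;                    // caller should end the session and refuse the request
}

// Prints the log line for the query string from Attack Scenario 2:
echo traversal_alert('user', '../../etc/passwd', '203.0.113.5'), "\n";
```

In a Web request, the client address would typically come from $_SERVER['REMOTE_ADDR'].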
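realpath() and basename() are two functions PHP provides to help avoid directory-traversal attacks. realpath() resolves any . or .. segments (and symbolic links) in a path and returns the canonical absolute path, or false if the file does not exist. For example, the $filename in Attack Scenario 2, passed to realpath(), would return just /etc/passwd. On the other hand, basename() strips the directory part of a name, leaving just the filename itself. Using these two functions, it is possible to rewrite the script of Attack Scenario 2 in a much more secure manner. Here basename() removes any directory components from the input, and realpath() confirms that the resulting file exists and still lies under /home/users:

$base = realpath('/home/users');
$path = realpath($base . '/' . basename($_GET['user'] ?? ''));
if ($base === false || $path === false || strpos($path, $base . '/') !== 0) {
    http_response_code(404);
    exit;
}
readfile($path);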

Tools of the secure trade

DotDotPwn is a very flexible, Perl-based intelligent fuzzer that detects directory-traversal vulnerabilities on HTTP/FTP servers. For Windows systems, it also detects the presence of boot.ini on vulnerable systems through directory-traversal vulnerabilities. It is available for free download on its website, along with its documentation.
This web resource contains many path-traversal URLs that are frequently used by attackers. This domain also contains other good resources on security and ethical hacking.

Source-code disclosure

Source-code disclosure, another variant of the path-traversal attack, is a widely prevalent vulnerability in Web applications, which lets attackers extract source code and configuration files. Such vulnerabilities are found mostly in websites that offer to download files using dynamic scripts.
The attacker uses this technique to obtain the source code of server-side scripts like ASP, JSP or PHP files, to discover Web application logic, including database structure, source code comments, parameters and other possibly exploitable vulnerabilities of the code. Let’s understand this using a simple attack scenario.

An attack scenario

Let’s assume a website uses the following PHP code, which initiates a file download from the server:

<?php
if (isset($_GET['file']))
{
    $file = $_GET['file'];
    readfile($file);
}
?>

A valid URL for the above script is http://www.example.com/downloads.php?file=arpit.zip, but the attacker’s malicious URL could be http://www.example.com/downloads.php?file=login.php, which returns to the attacker the contents of the file login.php. With this, the attacker learns about the filters and checks in login.php, and even the names of other crucial database and systems configuration files.
To secure your application from source-code disclosure, the application should use a predefined list of permissible file types, and reject any request for a different type.
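A download script that follows this advice could look something like the sketch below. It is only one possible shape: the function name resolve_download, the ALLOWED_EXTENSIONS list, and the downloads directory in the usage comment are assumptions made for illustration. In addition to the file-type allowlist, it confines the request to a single directory using basename() and realpath().

```php
<?php
// Illustrative allowlist (an assumption): only these file types may be served.
const ALLOWED_EXTENSIONS = ['zip', 'pdf', 'txt'];

// Hypothetical helper: returns the canonical path of a permitted download,
// or null if the request must be rejected.
function resolve_download(string $requested, string $baseDir): ?string
{
    $extension = strtolower(pathinfo($requested, PATHINFO_EXTENSION));
    if (!in_array($extension, ALLOWED_EXTENSIONS, true)) {
        return null;                // file type is not on the allowlist
    }
    $base = realpath($baseDir);
    // basename() drops any directory part; realpath() resolves symlinks.
    $path = realpath($baseDir . '/' . basename($requested));
    if ($base === false || $path === false) {
        return null;                // directory or file does not exist
    }
    if (strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;                // resolved path escaped the download directory
    }
    return $path;
}

// Possible use inside downloads.php (the directory name is an assumption):
//   $path = resolve_download($_GET['file'] ?? '', __DIR__ . '/downloads');
//   if ($path === null) { http_response_code(404); exit; }
//   readfile($path);
```

With this check, a request for login.php is refused because .php is not an allowed type, and a request such as ../../etc/passwd is refused because it has no allowed extension.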

Directory listing leakage

This is a commonly-found vulnerability in many Web servers. When a Web server receives a request for a directory rather than an actual file, it may respond in one of three ways:

It may return a (configurable) default resource within the directory, such as index.html, home.html, default.htm, default.asp, default.aspx, index.php, etc.
It may return an HTTP status code 403 error message, indicating that the request is not permitted.
It may return a listing showing the contents of the directory. This happens when default resources are not present in the directory.

In many situations, directory listings do not have any relevance to security. For example, disclosing the listing of an images directory may be completely inconsequential. Indeed, directory listings are often intentionally allowed because they provide a built-in means of navigating sites containing static content. But still, some files and directories are often, unintentionally, left within the Web root of servers, including:

Application-generated files: Web-authoring applications often generate files that find their way to the server. A good example is a popular FTP client, WS_FTP, which places a log file into each folder it transfers to the Web server. Since people often transfer folders in bulk, the log files themselves are transferred, exposing file paths and allowing the attacker to enumerate all files. Another example is CityDesk, which places a list of all files in the root folder of the site, in a file named citydesk.xml.
Configuration-management files: Configuration-management tools create many files with metadata. Again, these files are frequently transferred to the website. CVS, a long-popular version-control tool, keeps its files in a special folder named CVS. This folder is created as a sub-folder of every user-created folder, and it contains the files Entries, Repository, and Root.
Backup files: Text editors often create backup files with extensions such as ~, .bak, .old, .bkp, and .swp. When changes are performed directly on the server, backup files remain there. Even when created on a development server or workstation, by virtue of a bulk folder FTP transfer, they end up on the production server.
Exposed application files: Script-based applications often consist of files not meant to be accessed directly, but instead used as libraries or subroutines. Exposure happens if these files’ extensions are not recognised by the Web server as a script. Instead of executing the script, the server sends the full source code in response to a request. With access to the source code, the attacker can look for security-related bugs. Also, these files can sometimes be manipulated to circumvent application logic.
Web server’s crucial information: This is often displayed at the end of the listing page, and contains the server name, version number and other important information. Such information can be used to launch specific exploits against the Web server.

Apart from the above, there are many other files which should not be disclosed publicly, such as temporary files, renamed old files, users’ private home folders, etc.
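Administrators can check their own Web root for such leftovers before an attacker does. The sketch below walks a directory tree and reports files matching the examples mentioned above: backup extensions, CVS metadata folders, and tool-generated files. The function name find_leftover_files is hypothetical, the file name WS_FTP.LOG is the log name WS_FTP is known to use rather than something the article states, and the pattern list is a starting point rather than a complete inventory.

```php
<?php
// Hypothetical audit helper (an assumption for illustration): list files under
// $webRoot that look like backups, CVS metadata, or tool-generated leftovers.
function find_leftover_files(string $webRoot): array
{
    $suspicious = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($webRoot, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        $path = $file->getPathname();
        $name = $file->getFilename();
        // Files directly inside a folder named CVS (Entries, Repository, Root).
        $inCvsDir = basename(dirname($path)) === 'CVS';
        // Editor backup files: name ends in ~, .bak, .old, .bkp or .swp.
        $isBackup = preg_match('/(~|\.bak|\.old|\.bkp|\.swp)\z/i', $name) === 1;
        // Files written by the tools mentioned in the list above.
        $isToolFile = in_array(strtolower($name), ['ws_ftp.log', 'citydesk.xml'], true);
        if ($inCvsDir || $isBackup || $isToolFile) {
            $suspicious[] = $path;
        }
    }
    sort($suspicious);
    return $suspicious;
}

// Possible use (the path is an assumption; adjust to your document root):
//   foreach (find_leftover_files('/var/www/html') as $path) {
//       echo $path, "\n";
//   }
```

Anything the scan reports can then be removed from the server or blocked in the Web server configuration.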
Now, the question is how attackers can gain access to directory listings. I will not cover guessing the path of a crucial directory via the URI here. Instead, attackers can simply use Google’s advanced search operators such as site:, inurl:, intext: and intitle: (familiarity with these operators is obviously a prerequisite).

Note: I once again stress that neither LFY nor I are responsible for the misuse of the information given in this article. The attack techniques are meant to give you the knowledge that you need to protect your own infrastructure. You will be held solely responsible for any misuse of this knowledge.


Sunday, December 16, 2012
