by Marcus Whitney
When you have public forms on the web, you are always prone to attacks by those who want to use your application for their own purposes. Forums, polls, guestbooks, and blogs are some of the popular places where automated robots can be found submitting forms. The most common purpose for this automation is spam – the robots try to fill your site with links in an attempt to improve search engine rankings.
One preventive technology that has been employed by many sites and web properties such as Yahoo! is CAPTCHA. CAPTCHA, Completely Automated Public Turing test to tell Computers and Humans Apart, is a project of the Carnegie Mellon School of Computer Science. It is a technology that tries to distinguish between computers and humans and is particularly useful on the web, where the distinction is not easy to make.
According to Wikipedia, a Turing test is “a proposal for a test of a machine’s capability to perform human-like conversation.” For more information, visit http://en.wikipedia.org/wiki/Turing_test.
You have probably seen the CAPTCHA project in action at some of your web destinations. Its principal tool is a randomly created image that is displayed to the user. The image contains a phrase, but there is no mention of that phrase in computer-readable text on the rendered page or in the page’s source. If the form submission does not contain the correct phrase, you can safely assume that either the human made an error, or it wasn’t a human at all.
Thanks to Christian Wenz, PEAR has a package dedicated to providing these tests as security tools for your web applications. The Text_CAPTCHA package uses PHP’s GD functionality to dynamically create an image with a random phrase, and exposes this through a simple object-oriented interface. Before you can use Text_CAPTCHA, you must have GD installed with support for JPEGs, PNGs, and TrueType fonts. For more information on this procedure, please refer to the documentation at the PHP web site: http://www.php.net/image
Text_CAPTCHA has two dependencies: Image_Text and Text_Password. Text_Password is used to generate a random phrase, and Image_Text is used to generate an image of the phrase. The process for installing Text_CAPTCHA from the command line is:
pear install Text_Password
pear install Image_Text
pear install Text_CAPTCHA
A simple and common place to implement this security measure is the comment form for a blog. In this form, you typically capture the name, email address, and comment from a reader who wants to comment on what you’ve said. An example form is as follows:
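A minimal sketch of such a form, written as a PHP template. The file names comment_form.php and post_comment.php and the field names name, email, and comment are assumptions chosen for illustration:

```php
<?php /* comment_form.php (hypothetical file name): a plain blog comment form */ ?>
<form method="post" action="post_comment.php">
  Name:<br />
  <input type="text" name="name" /><br />
  Email address:<br />
  <input type="text" name="email" /><br />
  Comment:<br />
  <textarea name="comment" rows="6" cols="40"></textarea><br />
  <input type="submit" value="Post comment" />
</form>
```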
When adding CAPTCHA technology to your form, you add an image and ask that the user supply the phrase:
Please enter the phrase in the image below:
<input type="text" name="captcha_phrase" /><br />
<img src="captcha.jpg" />
This is where Text_CAPTCHA comes in. Prior to generating your form, you will need to implement Text_CAPTCHA:
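A sketch of the three statements involved, using only the calls this article names. It assumes the package is installed on your include path and that your installed release still uses the alpha-era init(width, height) signature described here; later releases may differ:

```php
<?php
require_once 'Text/CAPTCHA.php';            // load the PEAR package
$captcha = Text_CAPTCHA::factory('Image');  // ask the factory for the Image driver
$captcha->init(150, 150);                   // 150 x 150 pixels, random phrase
```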
The first line includes the Text_CAPTCHA package. The second line uses Text_CAPTCHA’s factory method to return an object from a subclass of Text_CAPTCHA. Text_CAPTCHA’s design allows for different driver types. Remember, CAPTCHA can be any technology that distinguishes humans from computers, not just images. The string 'Image' is passed to the factory method, instructing Text_CAPTCHA to generate an object with the Image driver.
The third line, $captcha->init(150, 150), initializes the Text_CAPTCHA object and prepares it for use. The init method accepts two parameters – width and height. These are used to determine the dimensions of the generated image (150 by 150 pixels in this case). These parameters are optional, and the default values are 200 and 80, respectively. There are two other optional parameters – phrase (the third parameter) and options (the fourth parameter). The phrase is the text seen in the generated image, and the options allow you to control some of the aspects of Image_Text.
To specify options to send to Image_Text, create an array called $options and pass this as the fourth parameter to init:
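A sketch of passing options through to Image_Text. The option keys font_size, font_path, and font_file are Image_Text settings as I understand them, and the values (including the font file COUR.TTF) are placeholders; substitute a TrueType font that actually exists on your server. Passing null as the third parameter is assumed to keep the randomly generated phrase:

```php
<?php
require_once 'Text/CAPTCHA.php';

// Placeholder values: point these at a TrueType font available on your server.
$options = array(
    'font_size' => 24,
    'font_path' => './',
    'font_file' => 'COUR.TTF',
);

$captcha = Text_CAPTCHA::factory('Image');

// width, height, phrase (null = let the package pick one), Image_Text options
$captcha->init(150, 150, null, $options);
```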
Once init has created the image with the phrase, the rest of the process is simple – you need to create an image that can be displayed in the browser. To do this, use the getCAPTCHAAsJPEG() (to generate a JPEG) or getCAPTCHAAsPNG() (to generate a PNG) accessor method. These methods return the image to you in a buffer, and you can simply write this data to a file:
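A sketch of the write-to-file step with fopen() and fwrite(). The helper write_image_file() is hypothetical and not part of Text_CAPTCHA; it takes whatever string getCAPTCHAAsJPEG() or getCAPTCHAAsPNG() returned, assuming that call succeeded:

```php
<?php
/**
 * Write raw image bytes to a file.
 * Returns true only if every byte was written.
 */
function write_image_file($path, $data)
{
    $handle = fopen($path, 'wb');   // binary mode so the bytes are not altered
    if ($handle === false) {
        return false;
    }
    $written = fwrite($handle, $data);
    fclose($handle);
    return $written === strlen($data);
}

// Typical use, with $captcha initialised as described in the article:
//   $image = $captcha->getCAPTCHAAsJPEG();
//   write_image_file('captcha.jpg', $image);
```

The file name captcha.jpg matches the src of the img tag in the form.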
PHP 5 makes this a lot easier with its file_put_contents() function:
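A sketch of the same step with file_put_contents(), which returns the number of bytes written or false on failure. The wrapper save_captcha_image() is a hypothetical name used only to make the return check explicit:

```php
<?php
/**
 * Save image bytes in a single call (PHP 5 and later).
 * Returns true only if the whole string was written.
 */
function save_captcha_image($path, $data)
{
    $bytes = file_put_contents($path, $data);
    return $bytes !== false && $bytes === strlen($data);
}

// Typical use, with $captcha initialised as described in the article:
//   save_captcha_image('captcha.jpg', $captcha->getCAPTCHAAsJPEG());
```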
The next step is to get the phrase from the current object and store it in a session variable. Using the getPhrase() accessor method of Text_CAPTCHA, set a session variable to the phrase so that you can later reference it in your code:
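A sketch of storing the phrase, assuming the session key captcha_phrase and the alpha-era init() signature described in this article. session_start() has to run before any output is sent to the browser:

```php
<?php
require_once 'Text/CAPTCHA.php';

session_start();  // must be called before any output

$captcha = Text_CAPTCHA::factory('Image');
$captcha->init(150, 150);

// Keep the expected answer on the server; it never appears in the page source.
$_SESSION['captcha_phrase'] = $captcha->getPhrase();
```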
To determine whether the user is a human, simply check whether the phrase that the user sends matches the one you saved in $_SESSION['captcha_phrase']. If it does not, you can discard the data and proceed as you see fit. The following example demonstrates this check:
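A sketch of the comparison, using the form field name captcha_phrase and the session key captcha_phrase from this article. The function name check_captcha() is hypothetical, and clearing the stored phrase after one attempt is a design choice of this sketch rather than something Text_CAPTCHA requires:

```php
<?php
/**
 * Compare the submitted phrase with the one stored in the session.
 * The stored phrase is removed so each image can be answered only once.
 */
function check_captcha(array $post, array &$session)
{
    $submitted = isset($post['captcha_phrase']) ? $post['captcha_phrase'] : null;
    $expected  = isset($session['captcha_phrase']) ? $session['captcha_phrase'] : null;

    unset($session['captcha_phrase']);

    // A missing or empty stored phrase never counts as a match.
    return is_string($submitted)
        && is_string($expected)
        && $expected !== ''
        && $submitted === $expected;
}
```

In the script that receives the form, you would call session_start() and then check_captcha($_POST, $_SESSION), processing the comment only when it returns true.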
CAPTCHA can be a great way to limit the number of successful, unwanted HTTP POST requests that your application receives. Text_CAPTCHA provides a convenient object to help you quickly and easily implement this feature. Text_CAPTCHA is currently in Alpha, but it is off to a great start and brings a touch of security consciousness to PEAR that is always appreciated. Happy CAPTCHA-ing!