CPSC 343 | Database Theory and Practice | Fall 2004
This document is intended to be a quick start to using PHP on the web - getting information about the web server environment, processing HTML forms, and working with sessions and cookies. Refer to the PHP Language Quick Start for information on the PHP language itself and integrating PHP into web pages, the PHP MySQL Quick Start for information on interacting with MySQL from PHP, or the PHP Local Quick Start for local details such as where to actually put your PHP pages so you can try them out.
Many details are omitted or simplified in these Quick Starts (e.g. not all syntax variations are presented). For more detailed and extensive information, check out one of the links below:
PHP provides access to information about the web server, the page request, and the environment in which PHP is executing via several reserved variables (known as autoglobals or superglobals).
The $_SERVER array provides access to values set by the web server. (about $_SERVER)
The following prints all of the keys and values available in the environment: (environ.php)
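<html>
<head>
<title>Server Environment</title>
</head>
<body>
<h1>Server Environment</h1>
<ul>
<?php
// Some values (e.g. the user agent) come from the client, so escape
// everything; print_r() also copes with the occasional array value.
foreach ( $_SERVER as $key => $value ) {
  print "<li> " . htmlspecialchars($key) . " = "
      . htmlspecialchars(print_r($value, true)) . "\n";
}
?>
</ul>
</body>
</html>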
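In practice a script usually needs only one or two entries rather than the full dump, and not every web server sets every key. The following is a minimal sketch of reading a single entry with a fallback; server_value() is a hypothetical helper (not part of PHP), and the sample array merely stands in for $_SERVER so the sketch can be run outside a web server.

```php
<?php
// Hypothetical helper (not part of PHP): look up one entry of a
// $_SERVER-style array, falling back to a default when the web
// server did not set that key (or set it to a non-string value).
function server_value(array $server, $key, $default = "")
{
    return isset($server[$key]) && is_string($server[$key])
        ? $server[$key]
        : $default;
}

// Sample data standing in for $_SERVER; in a real page you would
// pass $_SERVER itself.
$sample = array(
    "REQUEST_METHOD" => "GET",
    "HTTP_HOST"      => "www.example.com",
);

print server_value($sample, "REQUEST_METHOD") . "\n";             // GET
print server_value($sample, "HTTP_USER_AGENT", "unknown") . "\n"; // unknown
```

In a page, the call would look like server_value($_SERVER, "HTTP_USER_AGENT", "unknown").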
Form data is available to the processing script via the $_POST and $_GET arrays (depending on the method used to submit the form). The name of the form element (specified in the "name" attribute of the form element tag) is used as the key in the $_POST and $_GET arrays. (about $_POST, about $_GET)
Consider the following form: (formdemo.html)
<html>
<head>
<title>Form Elements Demo</title>
</head>
<body>
<form action="formdemo.php" method="POST">
<table border=0 cellpadding=3>
<tr><td>Name:</td><td><input type="text" size="20" name="name"></td></tr>
<tr><td>Gender:</td>
<td><input type="radio" value="M" name="gender"> male
<input type="radio" value="F" name="gender"> female</td></tr>
<tr><td>Year:</td>
<td><select name="year">
<option value="firstyear">First Year</option>
<option value="sophomore">Sophomore</option>
<option value="junior">Junior</option>
<option value="senior">Senior</option>
</select></td></tr>
<tr><td valign="top">CS courses taken:</td>
<td><input type="checkbox" value="120" name="courses[]"> 120: Principles of Computer Science<br>
<input type="checkbox" value="124" name="courses[]"> 124: Introduction to Programming<br>
<input type="checkbox" value="225" name="courses[]"> 225: Intermediate Programming<br>
<input type="checkbox" value="226" name="courses[]"> 226: Computer Architecture<br>
<input type="checkbox" value="229" name="courses[]"> 229: Foundations of Computation<br>
<input type="checkbox" value="324" name="courses[]"> 324: Computer Graphics<br>
<input type="checkbox" value="327" name="courses[]"> 327: Data Structures and Algorithms<br>
<input type="checkbox" value="331" name="courses[]"> 331: Operating Systems<br>
<input type="checkbox" value="333" name="courses[]"> 333: Programming Languages<br>
<input type="checkbox" value="343" name="courses[]"> 343: Databases<br>
<input type="checkbox" value="371" name="courses[]"> 371: Topics<br>
<input type="checkbox" value="428" name="courses[]"> 428: Program Translators<br>
<input type="checkbox" value="441" name="courses[]"> 441: Networking<br>
<input type="checkbox" value="453" name="courses[]"> 453: Artificial Intelligence</td></tr>
</table>
<input type="submit" value="submit">
<input type="reset" value="reset">
</form>
</body>
</html>
Of note here is that if you name checkboxes using a name of the form "something[]", PHP will automatically build an array of the selected responses.
This form is processed by formdemo.php:
<html>
<head>
<title>Form Elements Demo - Results</title>
</head>
<body>
<table border=0 cellpadding=3>
<tr><td>Name:</td><td><?php print htmlspecialchars($_POST["name"]); ?></td></tr>
<tr><td>Gender:</td><td><?php print htmlspecialchars(isset($_POST["gender"]) ? $_POST["gender"] : ""); ?></td></tr>
<tr><td>Year:</td><td><?php print htmlspecialchars($_POST["year"]); ?></td></tr>
<tr><td valign="top">CS courses taken:</td>
<td>
<?php
// $_POST["courses"] is absent when no boxes are checked
$courses = isset($_POST["courses"]) ? $_POST["courses"] : array();
foreach ( $courses as $course ) {
  print htmlspecialchars($course) . "<br>\n";
}
?>
</td></tr>
</table>
</body>
</html>
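Note here how the checkboxes are handled - it is just $_POST["courses"] (not $_POST["courses[]"] even though the [] were included in the name in the form - the [] is just a signal to PHP to build an array), and $_POST["courses"] is treated as an array (so foreach is used to iterate through the values). Also note that if no boxes are checked, the browser sends nothing for that name and $_POST["courses"] is not set at all; the same is true of a radio button group in which nothing was selected.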
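Because anyone can submit arbitrary values to a form-processing script (not just the ones offered in the form), it is often worth comparing what arrives against the list of values the form actually offers. The following is a minimal sketch of that idea; allowed_values() is a hypothetical helper (not part of PHP), and the $post array merely stands in for $_POST so the sketch can be run outside a web server.

```php
<?php
// Hypothetical helper: keep only the submitted values that appear in
// a list of allowed values. Works for single fields and for
// "something[]" checkbox arrays alike.
function allowed_values($submitted, array $allowed)
{
    // A missing field arrives here as null; a single field as a string.
    if ($submitted === null) {
        return array();
    }
    $values = is_array($submitted) ? $submitted : array($submitted);
    $result = array();
    foreach ($values as $value) {
        if (is_string($value) && in_array($value, $allowed, true)) {
            $result[] = $value;
        }
    }
    return $result;
}

// Sample data standing in for $_POST (no gender was selected).
$post = array(
    "year"    => "junior",
    "courses" => array("124", "225", "999"),  // 999 is not in the form
);

$years   = array("firstyear", "sophomore", "junior", "senior");
$offered = array("120", "124", "225", "226", "229", "324", "327",
                 "331", "333", "343", "371", "428", "441", "453");

$year   = allowed_values(isset($post["year"]) ? $post["year"] : null, $years);
$taken  = allowed_values(isset($post["courses"]) ? $post["courses"] : null, $offered);
$gender = allowed_values(isset($post["gender"]) ? $post["gender"] : null, array("M", "F"));

print implode(",", $taken) . "\n";   // 124,225
```

An empty result means the field was missing or held a value the form never offered; what to do then (reject the submission, redisplay the form) is up to the page.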
HTTP is a stateless protocol - this means that once the web server has responded to a request, no information is retained about that request. However, it is often useful to have the concept of a "session" - that is, a series of interactions between a user and the web server. For example, a customer at an online store will typically place items into her cart and then check out. This may well be implemented with several separate actions - placing an item in a cart, viewing the cart, entering shipping info, entering billing info, and verifying the order - each with its own PHP page(s). Because HTTP is a stateless protocol, a little work is needed to connect each step with the next (since we don't want to make the customer re-enter some identifying piece of information at every step).
PHP provides support for the session concept by providing a mechanism for defining and accessing session variables. (Internally, each user is assigned a unique session ID which is used to locate the proper set of session variables - but this is done transparently so you do not need to worry about it.) Session variables generally persist until the user closes the web browser.
A session is started with session_start(). You must call session_start() before accessing any session variables and before any output is produced (either literal HTML or output produced by PHP statements such as print). It is safe to call it in every page that uses or sets session variables: it creates a session if one does not already exist, and resumes the current one if it does. (about session_start())
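When several included files each want to be sure a session is available, a second session_start() call within the same request is redundant, and recent PHP versions report a notice for it. A minimal sketch of a guard follows; ensure_session() is a hypothetical helper name, and the sketch assumes PHP 5.4 or later, where session_status() is available.

```php
<?php
// Hypothetical helper: start a session only if this request has not
// already started one. Assumes PHP 5.4+ for session_status().
function ensure_session()
{
    if (session_status() === PHP_SESSION_NONE) {
        session_start();
    }
}

// In a page, call this before any output, then use $_SESSION:
//   ensure_session();
//   $_SESSION["name"] = "Arthur";
```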
Session variables are accessed via the $_SESSION array. (about $_SESSION)
The first example prints out the values of any currently-set session variables: (sessionenv.php)
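<?php session_start(); ?>
<html>
<head>
<title>Session Variables</title>
</head>
<body>
<h1>Session Variables</h1>
<ul>
<?php
// Session values often originate from form input, so escape them;
// print_r() also copes with values that are arrays.
foreach ( $_SESSION as $key => $value ) {
  print "<li> " . htmlspecialchars($key) . " = "
      . htmlspecialchars(print_r($value, true)) . "\n";
}
?>
</ul>
</body>
</html>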
It is interesting to load this page both before and after visiting other pages which set session variables - it will reveal what has been set.
The second example shows a page which counts how many times the user has loaded it during the current session: (pagecount.php)
<?php session_start(); ?>
<html>
<head>
<title>Page Count</title>
</head>
<body>
<h1>Page Count</h1>
<?php
// session_is_registered() was removed in PHP 5.4; test the
// $_SESSION array directly instead
if ( !isset($_SESSION["pagecount_count"]) ) {
  $_SESSION["pagecount_count"] = 1;
} else {
  $_SESSION["pagecount_count"]++;
}
if ( $_SESSION["pagecount_count"] == 1 ) {
?>
<p>Welcome to the hit-counting page! Try reloading...</p>
<?php } else { ?>
<p>You have loaded this page <?php print $_SESSION["pagecount_count"]; ?> times.</p>
<?php } ?>
</body>
</html>
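The create-or-increment step is a pattern that comes up often with session variables, so it can be convenient to factor it out. A minimal sketch follows; bump_counter() is a hypothetical helper, and a plain array stands in for $_SESSION so the sketch can be run without a web server.

```php
<?php
// Hypothetical helper: increment a named counter in a $_SESSION-style
// array, creating it on first use, and return the new count.
function bump_counter(array &$session, $key)
{
    if (!isset($session[$key])) {
        $session[$key] = 0;
    }
    $session[$key]++;
    return $session[$key];
}

// A plain array stands in for $_SESSION here; in a page, call
// session_start() first and pass $_SESSION instead.
$fake_session = array();
bump_counter($fake_session, "pagecount_count");
bump_counter($fake_session, "pagecount_count");
print $fake_session["pagecount_count"] . "\n";   // 2
```

The array is passed by reference (the & in the parameter list) so that the change is made to the caller's array rather than to a copy.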
The third example shows how session variables can be used to convey information from one form to another. The first page (session1.html) contains only a form:
<html>
<head>
<title>Session Demo, Page 1</title>
</head>
<body>
<h1>Session Demo, Page 1</h1>
<p>What is your name?</p>
<form action="session2.php" method="POST">
<input type="text" name="name">
<input type="submit" value="submit">
</form>
</body>
</html>
The second page (session2.php) stores the name submitted in the form as a session variable, and displays a new form. Note that session_start() is the very first thing - it must be called before any output, and this includes plain HTML as well as anything produced by a PHP print or echo statement. The session variable can be set anywhere; it was just convenient to do it right after the session_start().
<?php
session_start();
$_SESSION["name"] = $_POST["name"];
?>
<html>
<head>
<title>Session Demo, Page 2</title>
</head>
<body>
<h1>Session Demo, Page 2</h1>
<p>What is your quest?</p>
<form action="session3.php" method="POST">
<input type="text" name="quest" size="50">
<input type="submit" value="submit">
</form>
</body>
</html>
The third page (session3.php) uses the quest submitted in the most recent form, as well as accessing the "name" session variable:
<?php session_start(); ?>
<html>
<head>
<title>Session Demo, Page 3</title>
</head>
<body>
<h1>Session Demo, Page 3</h1>
<ul>
<li> <p>Name: <?php print htmlspecialchars(isset($_SESSION["name"]) ? $_SESSION["name"] : ""); ?></p>
<li> <p>Quest: <?php print htmlspecialchars(isset($_POST["quest"]) ? $_POST["quest"] : ""); ?></p>
</ul>
</body>
</html>
Note that if you go visit other pages and then come back to session3.php (directly, without going through the sequence from session1.html) the name will still be displayed but the quest will be lost (since it was only ever sent when the session2.php form was submitted). However, if you close your browser, the name will be lost.
If a session doesn't meet your needs (because you want information to stick around even after the browser is closed, or because you want it to expire before the browser is closed), another option is to use cookies.
Cookies are a mechanism for storing information in the user's web browser. All of the cookies set by a particular web server are automatically sent with each request to that server.
The strategy for using cookies is to set the cookie using the setcookie() function; once set, any relevant cookies will automatically be sent with every user request and can be accessed via the $_COOKIE array. Important note: since cookies are conveyed as part of the HTTP headers, setcookie() must be called before any output is produced by the script. (about setcookie(), about $_COOKIE)
For example, consider how the third session variable example above can be rewritten using cookies. The first page (cookie1.html) still just displays the form:
<html>
<head>
<title>Cookie Demo, Page 1</title>
</head>
<body>
<h1>Cookie Demo, Page 1</h1>
<p>What is your name?</p>
<form action="cookie2.php" method="POST">
<input type="text" name="name">
<input type="submit" value="submit">
</form>
</body>
</html>
In the following (cookie2.php), note that the cookie is set as the very first thing - it must be set before any output, and this includes plain HTML as well as anything produced by a PHP print or echo statement. time()+60 causes the cookie to expire 60 seconds from the time when it was set; omitting the third parameter completely - setcookie("name",$_POST["name"]) - means the cookie expires when the browser is closed.
<?php setcookie("name",$_POST["name"],time()+60); ?>
<html>
<head>
<title>Cookie Demo, Page 2</title>
</head>
<body>
<h1>Cookie Demo, Page 2</h1>
<p>What is your quest?</p>
<form action="cookie3.php" method="POST">
<input type="text" name="quest" size="50">
<input type="submit" value="submit">
</form>
</body>
</html>
cookie3.php is nearly the same as session3.php, except that the name is retrieved from the cookie instead of from a session variable:
<html>
<head>
<title>Cookie Demo, Page 3</title>
</head>
<body>
<h1>Cookie Demo, Page 3</h1>
<ul>
<li> <p>Name: <?php print htmlspecialchars(isset($_COOKIE["name"]) ? $_COOKIE["name"] : ""); ?></p>
<li> <p>Quest: <?php print htmlspecialchars(isset($_POST["quest"]) ? $_POST["quest"] : ""); ?></p>
</ul>
</body>
</html>
Because of the one minute time limit set for this cookie, you'll find that the name value is no longer set if you linger too long (more than a minute) while filling out the second form or if you try to reload the final result page after the cookie has expired. On the other hand, if you set a longer cookie lifetime (or are very quick), you can close your browser and then reload cookie3.php (directly, without going through the sequence from cookie1.html) and still see the name you entered (but not the quest, because that was never saved in a cookie).
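The third argument of setcookie() is an absolute Unix timestamp, not a lifetime, which is why the example adds the lifetime to time(). A minimal sketch of that arithmetic follows; cookie_expiry() is a hypothetical helper, and its $now parameter exists only so the calculation can be checked with a fixed time.

```php
<?php
// Hypothetical helper: compute the expiry timestamp to pass as the
// third argument of setcookie(). $now defaults to the current time.
function cookie_expiry($lifetime_seconds, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    return $now + $lifetime_seconds;
}

$one_minute = cookie_expiry(60);
$one_week   = cookie_expiry(7 * 24 * 60 * 60);
$expired    = cookie_expiry(-3600);   // a time in the past

// In a page, before any output:
//   setcookie("name", $_POST["name"], $one_week);   // keep for a week
//   setcookie("name", "", $expired);                 // ask the browser to delete it
```

Setting a cookie with an expiry time in the past is the usual way to remove it from the browser.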
In general, if session variables meet your needs, they should be used instead of cookies. Users sometimes disable cookies in their browsers because cookies can be a privacy risk (since they can be retrieved by the web server without the user's knowledge and can be used to track browsing behavior). Session IDs are generally conveyed via a cookie, but if cookies cannot be used, PHP can instead be configured to pass the session ID in the URL.
There is a third mechanism, which is suitable in situations where a user fills in a series of forms (call them formA, formB, etc), and information entered in an earlier form is needed in the processing of a later form (e.g. information entered in formB is needed when processing formD). In this case, it is possible to dynamically generate (via PHP) every form from formB on and insert one or more hidden elements into the form to convey information from the processing of the current form. These hidden elements are not displayed on the page, but their values are sent to the processing script when a form is submitted, just like the values of regular form elements.
For example, consider how the session variable/cookie examples above can be rewritten using hidden form elements. (The cookie example is a case where hidden form elements can be used.) The first page (hidden1.html) is still just the first form:
<html>
<head>
<title>Hidden Forms Demo, Page 1</title>
</head>
<body>
<h1>Hidden Forms Demo, Page 1</h1>
<p>What is your name?</p>
<form action="hidden2.php" method="POST">
<input type="text" name="name">
<input type="submit" value="submit">
</form>
</body>
</html>
hidden2.php generates the next form and inserts a hidden form element to store the name entered in the first form:
<html>
<head>
<title>Hidden Forms Demo, Page 2</title>
</head>
<body>
<h1>Hidden Forms Demo, Page 2</h1>
<p>What is your quest?</p>
<form action="hidden3.php" method="POST">
<?php // escape the value so a quote in the name cannot end the attribute early ?>
<input type="hidden" name="name" value="<?php print htmlspecialchars($_POST["name"], ENT_QUOTES); ?>">
<input type="text" name="quest" size="50">
<input type="submit" value="submit">
</form>
</body>
</html>
hidden3.php generates the final result. Though the "name" form element is not visible on-screen in the second form, its data is submitted along with the other (visible) form elements as if the user had just entered it.
<html>
<head>
<title>Hidden Forms Demo, Page 3</title>
</head>
<body>
<h1>Hidden Forms Demo, Page 3</h1>
<ul>
<li> <p>Name: <?php print htmlspecialchars(isset($_POST["name"]) ? $_POST["name"] : ""); ?></p>
<li> <p>Quest: <?php print htmlspecialchars(isset($_POST["quest"]) ? $_POST["quest"] : ""); ?></p>
</ul>
</body>
</html>
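One detail to watch with hidden elements is that the carried-over value is written into an HTML attribute, so a value containing a double quote or an angle bracket could end the attribute early and corrupt the form. A minimal sketch of generating the element safely follows; hidden_input() is a hypothetical helper, and the "UTF-8" character set is an assumption about the page's encoding.

```php
<?php
// Hypothetical helper: build a hidden <input> tag with the name and
// value escaped, so quotes or angle brackets in user input cannot
// break out of the attribute. Assumes the page is served as UTF-8.
function hidden_input($name, $value)
{
    return '<input type="hidden" name="'
        . htmlspecialchars($name, ENT_QUOTES, "UTF-8")
        . '" value="'
        . htmlspecialchars($value, ENT_QUOTES, "UTF-8")
        . '">';
}

print hidden_input("name", 'Sir "Lancelot" <of Camelot>') . "\n";
// <input type="hidden" name="name" value="Sir &quot;Lancelot&quot; &lt;of Camelot&gt;">
```

The browser decodes the entities again when the form is submitted, so the next script receives the original text. Keep in mind that the user can still view and alter hidden values, so they need the same validation as any other form data.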
Hidden form elements are easy to use and work well in situations where there is a sequence of forms (formA, formB, formC, formD, etc) and you need to pass information from an earlier step to a later step. On the other hand, session variables and cookies have several advantages over hidden form elements:
They persist for some length of time, and so can be used to store information from one visit to another. (Cookies can persist even after the browser is closed.) This property can be useful for remembering user preferences or the contents of the user's shopping cart, to give a few examples.
They can be accessed from any page, so a particular sequence of form submissions isn't necessary (useful in situations such as a user browsing an online store and adding items to her cart).
The values of session variables and cookies aren't available for all to see in the HTML source of the page returned to the client (though cookies aren't ideal for storing sensitive information such as passwords because the cookie values are stored in a file on the user's computer, where they are readable by anyone with access to that file...such as spyware programs).
In some cases, you may want to redirect the user to another page - for example, if the user tries to load a page without logging in, you may want to redirect her to a login page.
Redirection is specified in the HTTP headers returned to the client along with the server's response. HTTP headers can be specified with header(); the particular header relevant for redirection is the "Location" header. (about header) header() must be called before any output is produced (whether it be echoed HTML or output produced by PHP).
The Location header requires the URL that the client should be redirected to, and most clients require an absolute URL (e.g. http://sbridgem/343/php-examples/redirect2.php instead of just redirect2.php). The following example (redirect.php) illustrates the use of header(), along with how to use information from the server environment and the rtrim() and dirname() functions to construct an absolute URL for a file in the same directory as the current script. (about rtrim, about dirname) Of course, the full URL could be hard-coded in - but then if the directory structure on the webserver changes, you'd have to modify every page.
<?php
$host = $_SERVER["HTTP_HOST"];                        // webserver current file is on
$dir = rtrim(dirname($_SERVER["PHP_SELF"]),'/\\');    // directory current file is in
$filename = "redirect2.php";                          // actual file to redirect to
header("Location: http://".$host.$dir."/".$filename);
?>
<html>
<head>
<title>Redirect</title>
</head>
<body>
<p>Redirecting to <a href="<?php print $filename; ?>"><?php print $filename; ?></a>...</p>
</body>
</html>
It is a courtesy to include a link to the page that is being redirected to in the body of the document, in case the user's browser doesn't carry out the redirection automatically for some reason.
The page being redirected to can be anything - there aren't any particular requirements. For example: (redirect2.php)
<html>
<head>
<title>Redirect</title>
</head>
<body>
<h2>Redirect</h2>
<p>You have been redirected here.</p>
</body>
</html>
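To see what the rtrim() and dirname() calls in redirect.php contribute, the URL construction can be pulled out and tried on sample values. This is a minimal sketch; sibling_url() is a hypothetical helper, and the sample arrays (including the host name www.example.com) merely stand in for $_SERVER.

```php
<?php
// Hypothetical helper: build an absolute URL for a file in the same
// directory as the current script, from a $_SERVER-style array.
function sibling_url(array $server, $filename)
{
    $host = $server["HTTP_HOST"];
    $dir  = rtrim(dirname($server["PHP_SELF"]), '/\\');
    return "http://" . $host . $dir . "/" . $filename;
}

$sample = array(
    "HTTP_HOST" => "www.example.com",
    "PHP_SELF"  => "/343/php-examples/redirect.php",
);
print sibling_url($sample, "redirect2.php") . "\n";
// http://www.example.com/343/php-examples/redirect2.php

// For a script at the top level, dirname() returns just a slash,
// which rtrim() reduces to "" so the result has no double slash.
$top = array("HTTP_HOST" => "www.example.com", "PHP_SELF" => "/index.php");
print sibling_url($top, "login.php") . "\n";
// http://www.example.com/login.php
```

In a page, the result would be used as header("Location: " . sibling_url($_SERVER, "redirect2.php")), called before any output.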
For efficiency, many web browsers cache pages so they don't have to send repeated requests for the same page. However, PHP scripts typically generate dynamic content which shouldn't be cached. You can force clients to not cache by including the following in each page not to be cached: (example from the PHP Manual)
<?php
header("Cache-Control: no-cache, must-revalidate");   // HTTP/1.1
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");     // Date in the past
?>
As with redirection, the call to header() must occur before any output is produced (whether it be echoed HTML or output produced by PHP). (about header)
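Since the same header lines are needed in every page that should not be cached, one option is to keep them in a single included file. The following is a minimal sketch under that assumption; no_cache_headers() and send_no_cache_headers() are hypothetical helper names, and headers_sent() is used to detect the case where output has already started.

```php
<?php
// Hypothetical helper: the header lines from the example above,
// collected in one place so every dynamic page sends the same set.
function no_cache_headers()
{
    return array(
        "Cache-Control: no-cache, must-revalidate",   // HTTP/1.1
        "Expires: Mon, 26 Jul 1997 05:00:00 GMT",     // date in the past
    );
}

// Hypothetical helper: send the headers, but only while that is
// still possible. Returns false if output has already started.
function send_no_cache_headers()
{
    if (headers_sent()) {
        return false;
    }
    foreach (no_cache_headers() as $line) {
        header($line);
    }
    return true;
}

// In a page, before any output:
//   send_no_cache_headers();
```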
The PHP Manual summarizes the general considerations quite well:
A completely secure system is a virtual impossibility, so an approach often used in the security profession is one of balancing risk and usability. If every variable submitted by a user required two forms of biometric validation (such as a retinal scan and a fingerprint), you would have an extremely high level of accountability. It would also take half an hour to fill out a fairly complex form, which would tend to encourage users to find ways of bypassing the security.
The best security is often unobtrusive enough to suit the requirements without the user being prevented from accomplishing their work, or over-burdening the code author with excessive complexity. Indeed, some security attacks are merely exploits of this kind of overly built security, which tends to erode over time.
A phrase worth remembering: A system is only as good as the weakest link in a chain. If all transactions are heavily logged based on time, location, transaction type, etc. but the user is only verified based on a single cookie, the validity of tying the users to the transaction log is severely weakened.
When testing, keep in mind that you will not be able to test all possibilities for even the simplest of pages. The input you may expect will be completely unrelated to the input given by a disgruntled employee, a cracker with months of time on their hands, or a housecat walking across the keyboard. This is why it's best to look at the code from a logical perspective, to discern where unexpected data can be introduced, and then follow how it is modified, reduced, or amplified.
The Internet is filled with people trying to make a name for themselves by breaking your code, crashing your site, posting inappropriate content, and otherwise making your day interesting. It doesn't matter if you have a small or large site, you are a target by simply being online, by having a server that can be connected to. Many cracking programs do not discern by size, they simply trawl massive IP blocks looking for victims. Try not to become one.
The number one thing that you, as a programmer writing PHP programs, should do is to check all user-submitted data for validity - any time you use user-submitted information, consider carefully what could happen if an unexpected value is submitted. Remember - anyone browsing the web can access your PHP page.
Some examples: