CPSC 343 | Database Theory and Practice | Fall 2008 |
This document is intended to be a quick start to using PHP to connect to MySQL. Refer to the PHP Language Quick Start for information on the PHP language itself and integrating PHP into web pages, the PHP Web Quick Start for information on getting information about the web server environment, processing HTML forms, and working with sessions and cookies, or the PHP Local Quick Start for local details such as where to actually put your PHP pages so you can try them out.
Many details are omitted or simplified in these QuickStarts (e.g. not all syntax variations are presented). For more detailed and extensive information, check out one of the links below:
This Quick Start describes using the mysqli (MySQL Improved Extension) for connecting to MySQL.
There are four basic steps for connecting to a database:
Establish a connection to the database.
This is done with the mysqli_connect function, which takes the database host, username, password, and database to use as parameters. (about mysqli_connect) For example:
<?php $link = mysqli_connect("sbridgem.hws.edu","guest","guestpass","sailors"); ?>
The return value (stored in $link) can be tested to determine whether or not the connection was successfully established. (See the full example below for how to do this.) The connection can fail if there is no MySQL server on the named host, if the username or password are not correct or the user doesn't have access to the named database, or if the database doesn't exist.
Send the query to the database.
This is done with the mysqli_query function, which takes the database connection and the query to execute as parameters. (about mysqli_query) For example:
<?php $query = "SELECT S.Sid,S.Sname,B.Bid,B.Bname,R.Day FROM BOAT B ". "NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S ". "ORDER BY S.Sid ASC, B.Bid ASC,R.Day ASC"; $result = mysqli_query($link,$query); ?>
The return value (stored in $result) can be tested to determine whether or not the query was successful. (See the full example below for how to do this.) The query can fail if the query contains a syntax error, the named tables and/or columns don't exist, or if the user doesn't have permission to carry out the query.
Process the results.
There are a number of functions which retrieve information about the results of the query. (about mysqli_fetch_assoc) The following example uses mysqli_fetch_assoc to retrieve rows of the result table and format them using an HTML table:
<table border=1 cellpadding=2> <tr> <th>Sailor ID</th> <th>Sailor Name</th> <th>Boat Name</th> <th>Boat ID</th> <th>Reservation Date</th> </tr> <?php // print results while ( $row = mysqli_fetch_assoc($result) ) { print "<tr>"; print "<td>".$row["Sid"]."</td>"; print "<td>".$row["Sname"]."</td>"; print "<td>".$row["Bname"]."</td>"; print "<td>".$row["Bid"]."</td>"; print "<td>".$row["Day"]."</td>"; print "</tr>\n"; } ?> </table>
Each call to mysqli_fetch_assoc returns the next row in the result table, and the return value evaluates to false if there are no more rows to retrieve. () The row is returned as an array indexed by the column names of the result table.
Free the query results and close the connection.
mysqli_free_result frees the result (only necessary if results were returned e.g. for SELECT queries but not INSERT queries), and mysqli_close closes the connection: (about mysqli_free_result, about mysqli_close)
<?php mysqli_free_result($result); mysqli_close($link); ?>
All of these elements come together, along with error-checking, in allreservations.php (which displays all of the reservation information in the sailors database):
<html> <head> <title>Sailor Reservations</title> </head> <body> <h1>Sailor Reservations</h1> <?php // establish connection $link = mysqli_connect("sbridgem.hws.edu","guest", "guestpass","sailors"); /* dangerous! - password is stored in plain text */ // check connection if ( !$link ) { ?> <blockquote> <p><b>Connection failed.</b><br> <?php print mysqli_connect_error(); ?></p> <!-- dangerous! - error message can reveal information of use to an attacker --> </blockquote> <?php } else { // connection is good... $query = "SELECT S.Sid,S.Sname,B.Bid,B.Bname,R.Day FROM BOAT B ". "NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S ". "ORDER BY S.Sid ASC, B.Bid ASC,R.Day ASC"; $result = mysqli_query($link,$query); // check success if ( !$result ) { ?> <blockquote> <p><b>Error executing query.</b><br> <?php print mysqli_error($link); ?></p> /* dangerous! - error message can reveal information of use to an attacker */ </blockquote> <?php } else { // format the results nicely in a table ?> <table border=1 cellpadding=2> <tr> <th>Sailor ID</th> <th>Sailor Name</th> <th>Boat Name</th> <th>Boat ID</th> <th>Reservation Date</th> </tr> <?php // print results while ( $row = mysqli_fetch_assoc($result) ) { print "<tr>"; print "<td>".$row["Sid"]."</td>"; print "<td>".$row["Sname"]."</td>"; print "<td>".$row["Bname"]."</td>"; print "<td>".$row["Bid"]."</td>"; print "<td>".$row["Day"]."</td>"; print "</tr>\n"; } ?> </table> <?php // free result when we are done mysqli_free_result($result); } // close link mysqli_close($link); } ?> </body> </html>
(about mysqli_connect_error, about mysqli_error)
User-supplied information (such as what comes from a form) can be integrated into the query - just concatenate the appropriate values into the appropriate query text. Keep in mind that all non-integer values must be surrounded by single quotes (and you shouldn't make the user enter the quotes). For example, to look for reservations involving a sailor named in a HTML form box named "sailor":
$query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='". $_POST["sailor"]."' ORDER BY B.Bid ASC, R.Day ASC";
Pay close attention to the single quotes following S.Sname= and just before ORDER BY - these surround the sailor name value taken from the form.
The following (reservationquery.html) presents a form where the user can enter a username and password for connecting to the database as well as the name of the particular sailor of interest:
<html> <head> <title>Sailor Reservation Query</title> </head> <body> <h1>Sailor Reservation Query</h1> <form action="reservationquery.php" method="POST"> <table border=0 cellpadding=3> <tr><td>Username:</td><td><input type="text" name="username"></td></tr> <tr><td>Password:</td><td><input type="password" name="password"></td></tr> <tr><td>Display reservations for:</td> <td><input type="text" name="sailor"> (sailor name)</td></tr> </table> <input type="submit" value="get reservations"> <input type="reset" value="clear"> </form> </body> </html>
reservationquery.php executes the query and displays the results:
<html> <head> <title>Query Results</title> </head> <body> <h1>Query Results</h1> <p>Reservations for <?php print $_POST["sailor"]; ?>:</p> <?php // establish connection $link = mysqli_connect("sbridgem.hws.edu",$_POST["username"], $_POST["password"],"sailors"); // check connection if ( !$link ) { ?> <blockquote> <p><b>Connection failed.</b><br> <?php print mysqli_connect_error(); ?></p> <!-- dangerous! - error message can reveal information of use to an attacker --> </blockquote> <?php } else { // connection is good... $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='". $_POST["sailor"]."'"; /* dangerous! - user input is used without any validity checking */ $result = mysqli_query($link,$query); // check success if ( !$result ) { ?> <blockquote> <p><b>Error executing query.</b><br> <?php print mysqli_error($link); ?></p> <!-- dangerous! - error message can reveal information of use to an attacker --> </blockquote> <?php } else { // format the results nicely in a table ?> <table border=1 cellpadding=2> <tr><th>Boat Name</th><th>Boat ID</th><th>Reservation Date</th></tr> <?php // print results while ( $row = mysqli_fetch_assoc($result) ) { print "<tr>"; print "<td>".$row["Bname"]."</td>"; print "<td>".$row["Bid"]."</td>"; print "<td>".$row["Day"]."</td>"; print "</tr>\n"; } ?> </table> <?php // free result when we are done mysqli_free_result($result); } // close link mysqli_close($link); } ?> </body> </html>
When you start pasting user input into queries, you run the risk of malformed queries and a more serious problem known as SQL Injection in which the user can gain unauthorized access to the database.
Consider the reservation-retrieval example from the previous section. If the user enters the name Dustin, all is well - $_POST["sailor"] has the value Dustin and substituting that into the query yields
SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='Dustin'
But what if the user enters D'Anna as the sailor name? Substituting that into the query yields
SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='D'Anna'
This is no longer legal because the apostrophe in "D'Anna" matches with the first single quote surrounding the sailor name.
This query will fail with no harm done, but what if the user is more clever about what they enter for the sailor's name? For example, consider what happens if the user enters Dustin'; DELETE FROM RESERVATION WHERE Sid<>' instead? Substituting that in yields
SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='Dustin'; DELETE FROM RESERVATION WHERE Sid<>''
These are two perfectly legal queries...and not what the person writing the reservation query page had in mind! (Imagine if a table storing passwords was substituted for RESERVATION...)
Now, this particular case is not a problem with mysqli_query() because it does not permit more than one query to be executed in a single function call. However, what if the user instead enters Dustin' UNION SELECT 'Swift',S.Sid,'1900-01-01' FROM SAILOR S WHERE S.Sid<>' for the sailor name? Substituting that value in yields
SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='Dustin' UNION SELECT 'Swift',S.Sid,'1900-01-01' FROM SAILOR S WHERE S.Sid<>''
This is a single legal query which carries out the intended reservations-for-Dustin query but also includes information about all of the sailor IDs in the database. (Imagine if a table storing passwords was accessed instead of SAILOR...)
Similar attacks are possible with UPDATE or DELETE queries, where an attacker could adjust the WHERE conditions.
There are two drawbacks with the database-access methods prevented so far - efficiency and security.
First, each query sent to the database with mysqli_query must be parsed and processed separately. This requires a lot of duplication of effort if you are sending several similar queries e.g. first asking for all of the reservations involving Dustin, then Horatio, then Lubber, then... The only thing different about these queries is the name of the sailor.
Second, the database is vulnerable to SQL injection attacks. There are things you can do to process the user input to make it safer with mysqli_query, but there's another alternative...
Prepared statements address both of these concerns by replacing the single step of calling mysqli_query with two steps:
Only the second step needs to be repeated to send multiple similar queries. And, because placeholders can only be substituted for certain elements of the query and are treated specially, SQL injection is no longer a problem.
Let's consider how to use prepared statements to rewrite the reservation page: (error-checking is omitted)
Establish a connection to the database.
This is the same as before:
$link = mysqli_connect("sbridgem.hws.edu",$_POST["username"], $_POST["password"],"sailors");
Define the query, using question marks as placeholders for values that will changed from query to query or where user input will be plugged in.
$query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname=?";
You don't have to create a separate variable for the query string - you could pass it directly to mysqli_prepare - but it is easier to read this way.
Send the query to the database.
This is done with the mysqli_prepare function, which takes the database link and query as parameters. (about mysqli_prepare) For example:
$stmt = mysqli_prepare($link,$query);
Set values for the placeholders.
This is done with the mysqli_stmt_bind_param function, whose first parameter is the statement, second parameter is a string indicating the type of each parameter and whose other parameters are the values (in order) to substitute for placeholders. (about mysqli_stmt_bind_param) For example:
mysqli_stmt_bind_param($stmt,'s',$_POST["sailor"]);
The second parameter is 's' because the sailor name is a string type (s for string) and there is only one placeholder.
Execute the query.
This is done with the mysqli_stmt_execute function. (about mysqli_stmt_execute) For example:
mysqli_stmt_execute($stmt);
Process the results.
This is done by linking variables to the columns of the result table with mysqli_stmt_bind_result, and then repeatedly retrieving one row of the result with mysqli_stmt_fetch. The variables you've bound will be assigned the appropriate values from the row each time. (about mysqli_stmt_bind_result, about mysqli_stmt_fetch) For example:
<table border=1 cellpadding=2> <tr><th>Boat Name</th><th>Boat ID</th><th>Reservation Date</th></tr> <?php mysqli_stmt_bind_result($stmt,$bname,$bid,$day); while ( mysqli_stmt_fetch($stmt) ) { print "<tr>"; print "<td>".$bname."</td>"; print "<td>".$bid."</td>"; print "<td>".$day."</td>"; print "</tr>\n"; } ?> </table>
You don't have to name your variables the same as each column, but it makes it easier to keep track of them if you do. Note that the parameters to bind_result are in the same order as in the table returned by the query.
Close the statement and database connection.
mysqli_stmt_close closes the statement. (about mysqli_stmt_close, about mysqli_close)
<?php mysqli_stmt_close(); mysqli_close($link); ?>
Only close the statement when you are done with it - if you are going to execute several more similar queries, repeat steps 4-6 for each additional query before closing the statement.
The full example, without error-checking: (reservationquery2.php, driven by reservationquery2.html)
<html> <head> <title>Query Results</title> </head> <body> <h1>Query Results</h1> <p>Reservations for <?php print $_POST["sailor"]; ?>:</p> <?php // establish connection $link = mysqli_connect("sbridgem.hws.edu",$_POST["username"], $_POST["password"],"sailors"); $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname=?"; $stmt = mysqli_prepare($link,$query); mysqli_stmt_bind_param($stmt,'s',$_POST["sailor"]); mysqli_stmt_execute($stmt); // format the results nicely in a table ?> <table border=1 cellpadding=2> <tr><th>Boat Name</th><th>Boat ID</th><th>Reservation Date</th></tr> <?php mysqli_stmt_bind_result($stmt,$bname,$bid,$day); while ( mysqli_stmt_fetch($stmt) ) { print "<tr>"; print "<td>".$bname."</td>"; print "<td>".$bid."</td>"; print "<td>".$day."</td>"; print "</tr>\n"; } ?> </table> <?php mysqli_stmt_close($stmt); // close link mysqli_close($link); ?> </body> </html>
Should you use prepared statements for every query? No - if you only send a particular query once and it doesn't include user input, you've increased the time it takes to execute the query because now you have two communications with the DBMS (sending the query, and sending the placeholder values) instead of one. (You've also lost efficiency if you only send any query once, even if it involves user input - but the protection against SQL injection is worth it then.) In short, use mysqli_query for single queries not involving user input and prepared statements if you are going to issue multiple similar queries or if the query involves user input.
Since things can go wrong at nearly even step of the communication with the database (illegal query syntax, bad username/password, DBMS isn't accepting connections, etc), error-checking can quickly turn into a massive tangle.
Exceptions can be used to cut through this tangle. The idea is that if an error occurs early in the process of connecting to the database and sending a query, all subsequent steps should be skipped. The solution: put this whole process of connecting and sending queries into a try-catch block, and throw an exception if an error occurs. The exception handler displays an error message. Note that the cleanup steps (closing the statement and the link) should be done after the try-catch, because you'll need to do them if you get even partway through the connecting-and-sending process - but they are protected so they aren't executed if the link or statement didn't get opened in the first place.
Here's a third version of the reservations page (reservationquery3.php, driven by reservationquery3.html), with error-handling. Note that since we want to do the same thing for every error (log the message to the webserver's error log and throw an exception), a function error is defined to avoid repeated code. $errlog is an internal detailed message (useful to developers but not users), while $errmsg is the error message to be displayed to the user.
<?php function error ( $errlog, $errmsg ) { error_log("[".$_SERVER["PHP_SELF"]."] error: (".$errmsg.") ". $errlog); throw new Exception($errmsg); } ?> <html> <head> <title>Query Results</title> </head> <body> <h1>Query Results</h1> <p>Reservations for <?php print $_POST["sailor"]; ?>:</p> <?php try { // establish connection $link = mysqli_connect("sbridgem.hws.edu",$_POST["username"], $_POST["password"],"sailors"); // check connection if ( !$link ) { error(mysqli_connect_error(),"Could not connect to database."); } // connection is good... $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname=?"; $stmt = mysqli_prepare($link,$query); if ( mysqli_errno($link) || !$stmt ) { error(mysqli_error($link),"Error executing query."); } mysqli_stmt_bind_param($stmt,'s',$_POST["sailor"]); if ( mysqli_stmt_errno($stmt) ) { error(mysqli_stmt_error($stmt),"Error executing query."); } mysqli_stmt_execute($stmt); if ( mysqli_stmt_errno($stmt) ) { error(mysqli_stmt_error($stmt),"Error executing query."); } // format the results nicely in a table ?> <table border=1 cellpadding=2> <tr><th>Boat Name</th><th>Boat ID</th><th>Reservation Date</th></tr> <?php mysqli_stmt_bind_result($stmt,$bname,$bid,$day); if ( mysqli_stmt_errno($stmt) ) { error(mysqli_stmt_error($stmt),"Error retrieving results."); } while ( mysqli_stmt_fetch($stmt) ) { if ( mysqli_stmt_errno($stmt) ) { error(mysqli_stmt_error($stmt),"Error retrieving results."); } print "<tr>"; print "<td>".$bname."</td>"; print "<td>".$bid."</td>"; print "<td>".$day."</td>"; print "</tr>\n"; } ?> </table> <?php } catch ( Exception $e ) { ?> <blockquote> <p><?php print $e->getMessage() ?></p> </blockquote> <?php } if ( $stmt ) { mysqli_stmt_close($stmt); } if ( $link ) { mysqli_close($link); } ?> </body> </html>
For security reasons, you should avoid revealing information about the database schema in error messages. (Some attacks require knowing something about the database schema such as column names, column types, table names, etc.) This means that things like show_source(), mysqli_connect_error(), and mysqli_error() - which are very useful for debugging - should not be present in a final production website. (about show_source, about mysqli_connect_error, about mysqli_error) Instead, a production website should use a mechanism like error_log to log a detailed error message in a developer-accessible place while presenting only a "safe" generic message to the user. (about error_log)
The placeholders you put into prepared statements can't go anywhere - they can only substitute for values, not table or attribute names. Also, prepared statements can't be used for every SQL statement (though the ones you are likely to use - SELECT, INSERT, UPDATE, DELETE - are supported). That means that you may need to include user-supplied input in a query and can't use a prepared statement to deal with malformed queries and to protect against SQL injection.
Certain characters, like single quotes, have a special meaning to MySQL - but also may occur within the value of an attribute. For example, consider the query for finding reservations for a given sailor:
<?php $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='". $_POST["sailor"]."'"; ?>
As seen above, there's no problem if the user enters the name Dustin but there is if the name is D'Anna:
SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='D'Anna'
The solution is to escape any potentially special characters in the user input so that they are treated as literals rather than as special characters. Exactly what is special depends on the database; for MySQL, the function mysqli_real_escape_string() does the appropriate escaping. (about mysqli_real_escape_string) Thus, a better way to write the PHP line above is the following:
<?php $sailor = mysqli_real_escape_string($link,$_POST["sailor"]); $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='".$sailor."'"; ?>
It is not necessary to create a new $sailor variable, but doing so makes the second line more readable.
The special symbols % and _ occur in LIKE conditions. They aren't, however, escaped by mysqli_real_escape_string and thus will be interpreted as special characters instead of literals if they appear in user input. Use addcslashes to escape these characters: (about addcslashes)
<?php $sailor = addcslashes(mysqli_real_escape_string($link,$_POST["sailor"]), "%_"); $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname LIKE '".$sailor."%'"; ?>
The first parameter to addcslashes is the string to be processed; the second parameter is a string containing the character to be escaped. addcslashes must be done after mysqli_real_escape_string so that mysqli_real_escape_string won't try to escape the backslashes that addcslashes adds. This particular query finds reservations for all sailors whose name starts with the value the user entered in the form.
Tip #1: Always properly escape user input.
The problem of the user entering Dustin' UNION SELECT 'Swift',S.Sid,'1900-01-01' FROM SAILOR S WHERE S.Sid<>' for the sailor name can be avoided by first escaping the input:
<?php $sailor = mysqli_real_escape_string($link,$_POST["sailor"]); $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sname='".$sailor."'"; ?>
The result is then a search for reservations for the sailor whose name is Dustin' UNION SELECT 'Swift',S.Sid,'1900-01-01' FROM SAILOR S WHERE S.Sid<>'...which doesn't match any sailor.
A related example involves numeric input. Consider searching for reservations by sailor ID:
<?php $sailor = mysqli_real_escape_string($link,$_POST["sailor"]); $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sid=".$sailor; ?>
Note that the single quotes are left off because $sailor is a numeric value. Even properly escaped, this is vulnerable to a similar attack - just enter 22 UNION SELECT 'Swift',S.Sid,'1900-01-01' FROM SAILOR S WHERE S.Sid<> for the sailor ID to reveal all of the sailor IDs (in addition to the reservations for sailor 22).
A solution in this case is to use single quotes around every bit of user input (it is legal to use single quotes for numbers):
<?php $sailor = mysqli_real_escape_string($link,$_POST["sailor"]); $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sid='".$sailor."'"; ?>
In the reservations-by-sailor-ID example, putting single quotes around the numeric $sailor value and properly escaping it will result in a query returning an empty table...taking up database time as the query is carried out. This can be avoided by having PHP check that a numeric value is, in fact, numeric using the is_numeric function: (about is_numeric)
<?php if ( is_numeric($sailor) ) { $query = "SELECT B.Bname,B.Bid,R.Day FROM BOAT B NATURAL JOIN ". "RESERVATION R NATURAL JOIN SAILOR S WHERE S.Sid='".$sailor."'"; } else { // error message } ?>
Tip #2: Always check the type of numeric input.
Another denial-of-service attack can come from the % and _ wildcards used with LIKE. If not escaped, MySQL will treat them as wildcards. This can be an issue because the placement of % and _ in the LIKE condition has an effect on how quickly a query can be processed - allowing a user control over wildcards means a malicious user could choose a placement designed to slow down the database server. The solution is to also escape all % and _ symbols (using addcslashes) when user input is involved in a LIKE condition.