Pretty URLs and PHP

We are in the middle of working on a PHP/MySQL project right now, which happens to be fairly database-heavy. The application uses lots of predefined searches, filters, and sorts. We decided to use pretty URLs instead of GET parameters.

What does this mean? As an example, a page request that would look something like

/page.php?id=XXX&by=YYY&dir=ZZZ

gets turned into

/page/XXX/YYY/ZZZ/

This is similar in concept to a permalink, and is pretty easy to accomplish.

First, we create our page.php file. We then put a rule in .htaccess:

RewriteRule ^page(.*) page.php [L]

Don’t forget to turn on mod_rewrite (RewriteEngine On) and set the RewriteBase if you haven’t already.

This internally rewrites all requests that begin with "page" to page.php. It is not an HTTP redirect, so the address the user sees does not change. Now, we need to parse out the URL.

The actual URL that was requested appears in $_SERVER["REQUEST_URI"]. There are a few ways to pull out the actual parameters.

One option is to use PHP’s explode() function (the older split() was deprecated in PHP 5.3 and removed in PHP 7), which will return an array of the path segments. You would still have to validate the requested URL and the parameters in some manner, though.
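
As a rough sketch of the splitting approach, the following uses explode() on the request path. The function name split_request_path() is made up for illustration, and the sample URL stands in for $_SERVER["REQUEST_URI"]. Note that nothing here is validated yet.

```php
<?php
// Hypothetical helper: break a request URI into its path segments.
function split_request_path($uri)
{
    // Drop any query string so only the path is split.
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path)) {
        return array();
    }

    // trim() removes the leading and trailing slashes so explode()
    // does not produce empty first and last elements.
    $trimmed = trim($path, '/');
    return $trimmed === '' ? array() : explode('/', $trimmed);
}

// Sample value standing in for $_SERVER["REQUEST_URI"].
$parts = split_request_path('/page/42/name/asc/');
print_r($parts);
```

Here $parts holds "page", "42", "name", and "asc", all as strings. Each one would still need to be checked before use.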

We decided on using the PCRE function preg_match() to do the parsing and data validation in one step.

Let’s expand our example URL, /page/XXX/YYY/ZZZ/, and assume that XXX is a numeric ID, YYY is "name", "title", or "place", and ZZZ is "asc" or "desc". We can create a fairly simple regular expression that covers all of the valid combinations.

We would then do something like:

preg_match('#^/page/(\d+)/(name|title|place)/(asc|desc)/$#', $_SERVER["REQUEST_URI"], $args);

The preg_match() function is the key here. Other regular expression functions could be used, but preg_match() does the comparison and stores the matches in one call, which in effect parses out the results automatically.

If preg_match() returns 0, $args is left empty: the URL didn’t match our regular expression, so we have a malformed URL. In this case, we can redirect to an error page, or do whatever is appropriate for the application.

If preg_match() returns 1, then $args[0] will contain the full matched URL, $args[1] will contain the ID, $args[2] will contain "name", "title", or "place", and $args[3] will contain "asc" or "desc".
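
Putting the matching and the failure handling together might look like the sketch below. parse_page_url() is a hypothetical helper, not part of the project described here. It writes the expression with # delimiters so the slashes need no escaping, and it checks the return value of preg_match() rather than inspecting $args directly.

```php
<?php
// Hypothetical helper: validate and parse a /page/ID/FIELD/DIRECTION/ URL.
// Returns an array of parameters, or null when the URL is malformed.
function parse_page_url($uri)
{
    $pattern = '#^/page/(\d+)/(name|title|place)/(asc|desc)/$#';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $uri, $args) !== 1) {
        return null;
    }

    return array(
        'id'  => (int) $args[1],
        'by'  => $args[2],
        'dir' => strtoupper($args[3]),
    );
}

// Sample values standing in for $_SERVER["REQUEST_URI"].
var_dump(parse_page_url('/page/42/title/desc/'));   // array with id, by, dir
var_dump(parse_page_url('/page/abc/title/desc/'));  // NULL: ID is not numeric
```

In page.php, a null result would be the cue to send an error page or do whatever else suits the application.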

We can then do something similar to:

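$id = (int) $args[1];
$by = $args[2];               // "name", "title", or "place" (checked by the regex)
$dir = strtoupper($args[3]);  // "ASC" or "DESC" (checked by the regex)
// Column names and sort directions cannot be bound as parameters,
// so the validated values are placed directly in the SQL string.
$sql = "SELECT * FROM my_table WHERE id = ? ORDER BY $by $dir";
$result = $db->query($sql, array($id));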

Obviously, this is a simplified example, but the basics are there. This isn’t always appropriate, especially when you can’t come up with a decent regular expression to cover all the possible inputs. However, you can combine the regular expression method with other data parsing and validation techniques to potentially simplify your code. Also, this can help provide an extra layer of protection against malicious requests when combined with bound parameters, prepared statements, and restricted privileges (as appropriate).

Keep in mind that you may need to change relative links in the page to absolute references, because the browser resolves them against the pretty URL rather than against page.php. Building a simple page to input and display results from preg_match() is also very handy, both to debug the regular expression and to get a better grasp on the captured results.
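
A test page along those lines does not need much. The sketch below is one possible shape for it, written to run from the command line rather than a browser. describe_match() and the sample URLs are invented for illustration.

```php
<?php
// Hypothetical debugging helper: report how a pattern treats a URL.
function describe_match($pattern, $url)
{
    $result = preg_match($pattern, $url, $captures);

    if ($result === false) {
        return "$url: error in pattern";
    }
    if ($result === 0) {
        return "$url: no match";
    }

    // $captures[0] is the full match; the rest are the capture groups.
    return "$url: matched, captures = " . implode(', ', array_slice($captures, 1));
}

$pattern = '#^/page/(\d+)/(name|title|place)/(asc|desc)/$#';
$samples = array('/page/7/name/asc/', '/page/7/price/asc/', '/page/7/name/');

foreach ($samples as $url) {
    echo describe_match($pattern, $url), "\n";
}
```

Swapping in your own pattern and a handful of good and bad URLs makes it quick to see which inputs slip through and which groups end up in which position.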

