I lurve me some mod_rewrite. For those who are not familiar, it is a module that lets me take URLs and rewrite them using regular expressions, turning parts of the URL into variables.
For example, instead of sending you to store.com/?category=shirts&style=longsleeve&fabric=cotton&color=red, I can send you to store.com/shirts/longsleeve/cotton/red.
Nice, eh? Yeah.
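If you've never seen it in action, here is roughly what that mapping amounts to, sketched in PHP rather than as an actual rewrite rule. mod_rewrite does this with regex rules in the server config, not with PHP code; the function name pretty_path_to_params and the list of parameter names are made up purely for illustration.

```php
<?php
// Hypothetical helper: maps the path segments of a pretty URL onto named
// parameters, mimicking what a rewrite rule would hand to the script.
function pretty_path_to_params(string $path, array $names): array
{
    // Drop any query string, then split on slashes, ignoring empty pieces.
    $pathOnly = (string) parse_url($path, PHP_URL_PATH);
    $segments = array_values(array_filter(explode('/', $pathOnly), 'strlen'));

    $params = [];
    foreach ($names as $i => $name) {
        // Only fill in names that have a matching segment in the URL.
        if (isset($segments[$i])) {
            $params[$name] = urldecode($segments[$i]);
        }
    }
    return $params;
}

$params = pretty_path_to_params(
    '/shirts/longsleeve/cotton/red',
    ['category', 'style', 'fabric', 'color']
);
```

With that call, $params ends up holding the same four values the ugly query string carried, keyed by category, style, fabric and color.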
Well, I’ve taken on a new project, and the server is running the infernal IIS, not Apache. And I don’t have install/admin permissions. Suck. And the existing “codebase” is nothing short of horrendous.
It has the basic setup of site.com/bios/personA.html, site.com/bios/personB.html, site.com/bios/personC.html, etc. Well, I don’t like having that much of the _same_ HTML. Plus, it’s all gross. So to take care of the 20 HTML pages, I set up a PHP page that pulls the individual bio info from a MySQL database. Simple, right?
Well, I didn’t want the link to be site.com/bios/?user=personA. My hack? Create a custom 404 page:
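<?php
$uri = $_SERVER['QUERY_STRING']; // the custom 404 page receives the requested url here
if ( preg_match( "@/bios/(.+)@i", $uri, $matches ) ) { // parse the url
    $rewrite = $matches[1]; // e.g. "personA"; bios/index.php reads this
    header( "HTTP/1.1 200 OK" ); // override the 404 status so the page can be indexed
    include( "bios/index.php" ); // transfer control
} else {
    echo "Error: could not find " . htmlspecialchars( $uri ); // your real 404 error message
}
?>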
Basically, when you go to site.com/bios/personA, the webserver naturally treats that as a directory, and since it doesn’t exist, it falls through to the 404 handler. Luckily, the webhost lets me specify custom error pages, so I wrote the above, which just pulls out the username (if there is one) and includes the bios index page, which then displays the bio the way I want.
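To make the "pulls out the username" step concrete, here is a small sketch of just that part. Two things in it are assumptions on my part: the helper name extract_bio_name is made up, and the sample string reflects my understanding that IIS hands a custom 404 page the original URL in the form 404;http://host:port/path. The pattern is also a bit tighter than the one in the script, stopping at the next slash or question mark so a trailing slash doesn't end up in the name.

```php
<?php
// Hypothetical helper: pulls the bio name out of the query string that the
// custom 404 page receives. Returns null when the URL is not under /bios/.
function extract_bio_name(string $queryString): ?string
{
    // Capture a single path segment after /bios/, stopping at "/" or "?".
    if (preg_match('@/bios/([^/?]+)@i', $queryString, $matches)) {
        return urldecode($matches[1]);
    }
    return null;
}

// Assumed shape of what IIS passes along for a missing page.
$name = extract_bio_name('404;http://site.com:80/bios/personA');
```

Since whatever comes back ends up in a database lookup inside bios/index.php, it should still be treated as untrusted input there.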
Dirty, but functional.
February 10th, 2007 at 10:05 am
I’ve used a similar approach for a client’s website before. However, I learned the hard way that Apache/IIS will still send a 404 HTTP response code back to the browser or search engine spider. You may want to add a line to change the HTTP code to either 200 OK or 301 Moved Permanently before the include statement, as most search engines won’t index 404’ed pages.
February 10th, 2007 at 1:35 pm
Good point Trey, updated script. Didn’t think about the response code, to be honest.
Meh, next time…