Converting Links to Query Strings

Remember just a bit ago when I said that I was passing more than just updated-max and max-results in the query string? Here is why: if I kept the /year/month structure in the URL, I would need to create a folder on the server for every single combination of the two, each with its own index.php file to handle posts for that specific month. Instead, I opted to pass the year and month as parameters to index.php and let the server quietly map the request, so your web browser still thinks you are at the folder /year/month. There is a nifty way to do this using Apache’s mod_rewrite module.

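# Turn on the rewrite engine for this virtual host
RewriteEngine On
# Only act on paths of the form /digits/digits with an optional trailing slash
RewriteCond %{REQUEST_URI} ^/[0-9]+/[0-9]+/?$
# Capture both groups of digits and hand them to index.php as year and month
RewriteRule ^/([0-9]+)/([0-9]+)/?$ /index.php?year=$1&month=$2 [L]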

mod_rewrite is a very powerful tool that allows the web server to change browser requests on the fly. You first provide a condition under which a rule will act. If that condition is met, the rule applies a regular expression that finds and replaces text. For the example above, the condition is met when the REQUEST_URI consists of two groups of numbers separated by a slash, optionally followed by a trailing slash. If that is satisfied, the rule takes the two groups of numbers and adds them to a query string. The [L] at the end tells the server not to apply any more rules after this one. Visiting edward.p101/2019/07 will be treated by the server as edward.p101/index.php?year=2019&month=07. If you don’t believe me then visit both links; they are exactly the same!
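To see what the rule does to a path, here is a minimal sketch in Python. It reuses the regular expression from the RewriteRule above, but Python's `re` module is only a stand-in for Apache's regex engine, and the `rewrite` function name is made up for illustration; Apache does much more than this (conditions, flags, per-context path handling).

```python
import re

# Same pattern as the RewriteRule above. Python's `re` stands in for Apache here.
PATTERN = re.compile(r"^/([0-9]+)/([0-9]+)/?$")


def rewrite(path: str) -> str:
    """Return the rewritten target for `path`, or `path` unchanged if the pattern does not match."""
    match = PATTERN.match(path)
    if match is None:
        return path  # no match: the request would be left alone
    year, month = match.groups()  # these play the role of $1 and $2
    return f"/index.php?year={year}&month={month}"


print(rewrite("/2019/07"))   # /index.php?year=2019&month=07
print(rewrite("/2019/07/"))  # /index.php?year=2019&month=07
print(rewrite("/about"))     # /about
```

Note that the leading zero in the month survives, because the captured text is copied into the query string as-is rather than being parsed as a number.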

While this does make the website look nicer, there is a more important reason: every link on every page is hardcoded to use this format with slashes. Instead of manually rewriting each link to use a query string or putting index.php files in every location, it is much easier to set up a bunch of Rewrite statements in the site’s virtual hosts file. This is what that file looked like by the time I was done covering all of the possibilities:

The first two sets of Rewrites above control the results that are returned when clicking on the Newer/Older Post links at the bottom of a list. I’ll touch more on these links in a future article, but lists are collections of posts under categories such as Year, Month, and Label. When only “search” is in the URI and no label is provided, the query string is vetted. If the query string starts with “updated-max=”, has at least one character after it, and is followed by “&max-results=” and at least one number, processing moves on to the rule. The rule changes what the user sees to “/search” and adds the query string (as indicated with [L,QSA] at the end). The following block covers similar logic but for the case where a label is indicated.
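The query string check described above can be sketched in Python. The pattern below is reconstructed from that prose description only, so treat it as an assumption rather than the condition actually used in the virtual hosts file; the `query_is_accepted` name and the sample timestamp are also made up for illustration.

```python
import re

# Assumption: rebuilt from the description in the text, not copied from the real config.
# "updated-max=" at the start, at least one character, then "&max-results=" and digits.
QUERY_PATTERN = re.compile(r"^updated-max=.+&max-results=[0-9]+")


def query_is_accepted(query_string: str) -> bool:
    """Return True if the query string would pass the check described above."""
    return QUERY_PATTERN.search(query_string) is not None


print(query_is_accepted("updated-max=2019-07-01T00:00:00&max-results=5"))  # True
print(query_is_accepted("max-results=5"))                                  # False
print(query_is_accepted("updated-max=&max-results=5"))                     # False
```

The last case fails because there has to be at least one character between “updated-max=” and “&max-results=”.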

Pages should have a “/p” in the URI, which is handled by the third rewrite block. The characters within the square brackets indicate what is allowed in this section. Dashes indicate ranges of characters, so 0-9 covers all of the digits, a-z the lower case letters, and A-Z the capital ones. The underscore and dash at the end allow literal underscores and dashes to appear in the sequence. The + outside the square brackets requires one or more of the characters covered within. The ^ at the front marks the beginning of the string and the $ marks the end. The final three rewrite blocks are as described in the example at the beginning of this article.
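To check which strings that character class accepts, here is a small Python sketch. It combines only the pieces described above (the bracketed ranges, the trailing underscore and dash, the +, and the ^ and $ anchors); the complete rule from the virtual hosts file is not reproduced here, and the `is_valid_slug` name and sample strings are hypothetical.

```python
import re

# Digits, lower and upper case letters, underscore and dash; one or more, whole string.
SLUG_PATTERN = re.compile(r"^[0-9a-zA-Z_-]+$")


def is_valid_slug(text: str) -> bool:
    """Return True if every character in `text` is in the allowed set and `text` is not empty."""
    return SLUG_PATTERN.match(text) is not None


for candidate in ("about-me", "Contact_2019", "bad slug", "a/b", ""):
    print(repr(candidate), is_valid_slug(candidate))
# 'about-me' True
# 'Contact_2019' True
# 'bad slug' False
# 'a/b' False
# '' False
```

The dash sits at the very end of the brackets so that it is read as a literal dash instead of as part of a range.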

I spent a lot of time troubleshooting and arrived at the solution faster after raising the logging level of the Rewrite module. Here are some resources that helped me greatly along the way:
