Introduction
URL rewriting is the process of manipulating a URL or link that is sent to a web server, so that the link is dynamically modified at the server to include additional parameters and information, along with a server-initiated redirection. The web server performs all these manipulations on the fly, so the browser is kept out of the loop concerning the change made to the URL and the redirection.
URL rewriting can benefit your websites and web-based applications by providing better security and better visibility or friendliness with search engines, and it helps keep the structure of the website easier to maintain for future changes.
In this article we will take a look at how we can implement URL rewriting on an Apache-based web server environment using the mod_rewrite module for Apache.
What is mod_rewrite?
mod_rewrite is one of the most favored modules for the Apache web server, and many web developers and administrators would vote it the best thing to happen to Apache. This module has so many tricks up its sleeve that it can be called the Swiss Army knife of Apache modules. Apart from providing simple URL rewriting functionality for an Apache-based website, this module arms the website with better URL protection, better search engine visibility, protection against bandwidth thieves by stopping hot linking, hassle-free restructuring possibilities and the option to provide the friendliest of URLs to the website's users. Due to its versatility and functionality this module can at times feel a bit daunting to master, but a thorough understanding of the basics can make you a master of the art of URL rewriting.
Let's Begin! - A look at everything you need on your test environment to get mod_rewrite alive and kicking.
First and foremost you should have a properly configured Apache web server on your test machine. mod_rewrite is usually installed along with the Apache server, but in case it is missing - this can be the case on a Linux machine where the mod_rewrite module was not compiled along with the base server - you will have to get it installed. To use mod_rewrite on your Apache box you will have to configure this module to be loaded dynamically by Apache. On a shared server you will have to contact your web hosting company to get this module installed and loaded on Apache.
On your local machine you can find out whether the module is installed along with Apache by having a look at the modules directory of Apache. Check for a file named mod_rewrite.so; if it is there, the module can be loaded into the Apache server dynamically. In many default configurations this module is not loaded when Apache starts, and you need to tell Apache to enable it for dynamic loading by making changes in the web server's configuration file, as explained below.
How to Enable mod_rewrite on Apache?
You can make the mod_rewrite module load dynamically into the Apache web server environment using the LoadModule directive in the httpd.conf file. Open this file in a text editor and find a line similar to the one given below.
#LoadModule rewrite_module modules/mod_rewrite.so
Uncomment this line by removing the # and save the httpd.conf file. Restart your Apache server and, if all went well, the mod_rewrite module will now be enabled on your web server.
Let's Rewrite Our First URL Using mod_rewrite
OK, now the mod_rewrite module is enabled on your server. Let's have a look at how to switch the rewrite engine on and make it work for us.
To turn the rewrite engine on you have to add a single line to your .htaccess file. The .htaccess files are configuration files with Apache directives defined in them, and they provide distributed, directory-level configuration for a website. Create a .htaccess file in your web server's test directory - or any other directory in which you want to make URL rewriting active - and add the line given below to it.
RewriteEngine on
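If the line above seems to have no effect, the main server configuration may not be allowing .htaccess files to use rewrite directives. The following is only a sketch of what the relevant part of httpd.conf could look like; the directory path is a placeholder and your installation's layout will differ.

```apache
# httpd.conf (sketch; "/var/www/html/test" is a placeholder path)
<Directory "/var/www/html/test">
    # Allow .htaccess files here to use mod_rewrite directives,
    # which belong to the FileInfo override group
    AllowOverride FileInfo
    # Per-directory rewriting needs symbolic link following enabled
    Options FollowSymLinks
</Directory>
```

Apache reads httpd.conf only at startup, so restart the server after changing it.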
Now we have the rewrite engine turned on and Apache is ready to rewrite URLs for you. Let's look at a sample rewrite instruction that makes a request to our server for first.html get redirected to second.html at the server level. Add the line given below to your .htaccess file, along with the RewriteEngine directive that we added before.
RewriteRule ^first.html$ second.html
I will explain what we have done here in the next section, but if all went well then any request for first.html made on your server will be transferred to second.html. This is one of the simplest forms of URL rewriting.
A point to note here is that the redirect is kept completely hidden from the client, and this differs from classic HTTP redirects. The client or the browser is given the impression that the content of second.html is being fetched from first.html. This enables websites to create URLs on the fly without the client's awareness, and it is what makes URL rewriting very powerful.
Basics of mod_rewrite module
Now we know that mod_rewrite can be enabled for an entire website or for a specific directory by using a .htaccess file, and we have written a basic rewrite directive in the previous example. Here I will explain what exactly we did in that first sample rewrite.
The mod_rewrite module provides a set of configuration directives for URL rewriting, and the RewriteRule directive - which we saw in the previous sample - is the most important one. The mod_rewrite engine uses pattern-matching substitutions for making the translations, which means a good grasp of regular expressions can help you a lot.
Note: Regular expressions are so vast that they will not fit into the scope of this article. I will try to write another article on that topic someday.
1. The RewriteRule Directive
The general syntax of RewriteRule is very straightforward.
RewriteRule Pattern Substitution [Flags]
The Pattern part is the pattern that the rewrite engine will look for in the incoming URL. So in our first sample ^first.html$ is the Pattern. The pattern is written as a regular expression.
The Substitution is the replacement or translation that is to be done on the matched pattern in the URL. In our sample second.html is the Substitution part.
Flags are optional, and they make the rewrite engine do certain other tasks apart from just doing the substitution on the URL string. The flags, if present, are defined within square brackets and should be separated by commas.
Let's take a look at a more complex rewrite rule. Take a look at the following URL.
yourwebsiteurl/articles.php?category=stamps&id=122
Now we will change the above URL into a search engine friendly and user friendly URL like the one given below.
yourwebsiteurl/articles/stamps/122
Create a page called articles.php with the following code:
<?php
// Read the two query-string parameters passed in the URL.
$category = $_GET['category'];
$id = $_GET['id'];
// Print both values on the page.
echo "Category : " . $category . " ";
echo "Id : " . $id;
This page simply prints the two GET variables passed to it on the webpage.
Open the .htaccess file and write in the rule given below.
RewriteEngine on
RewriteRule ^articles/(\w+)/([0-9]+)$ /articles.php?category=$1&id=$2
The pattern ^articles/(\w+)/([0-9]+)$ can be broken down as:
^articles/ - checks if the request starts with 'articles/'
(\w+)/ - checks if this part is a single word followed by a forward slash. The parentheses are used for extracting the parameter values that we need to place in the actual query string of the substituted URL. The part of the URL matched by the pattern enclosed in parentheses is stored in a special variable, which can be back-referenced in the substitution part using variables like $1, $2 and so on, one for each pair of parentheses.
([0-9]+)$ - this checks for digits at the last part of the URL.
Try requesting the articles.php file on your test server with the URL given below.
yourwebsiteurl/articles/coins/1222
The URL rewrite rule you have written will kick in, and you will see the same result as if the URL requested were:
yourwebsiteurl/articles.php?category=coins&id=1222
Now you can work on this sample to build more and more complex URL rewriting rules. By using URL rewriting in the above example we have achieved a search engine friendly and user friendly URL, which is also more resistant to casual script-kiddie injection attacks, because a request whose category or id contains unexpected characters never matches the rule.
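As one small step in that direction, here is a hedged sketch of a slightly more forgiving variant of the same rule. It assumes the same articles.php page as above and uses the [L] flag, which is described in the next section.

```apache
RewriteEngine on
# Same rule as in the example above, with two small variations:
#   /?   also accepts an optional trailing slash (articles/stamps/122/)
#   [L]  stops rule processing for this pass once the rule has matched
RewriteRule ^articles/(\w+)/([0-9]+)/?$ /articles.php?category=$1&id=$2 [L]
```

Whether a trailing slash should be accepted or redirected to a single canonical form is a design choice; this sketch simply accepts both.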
What does the Flags parameter of RewriteRule directive do?
RewriteRule flags provide us with a way to control how mod_rewrite handles each rule. These flags are defined inside a single set of square brackets, separated by commas, and there are about 15 flags to choose from. They range from those which control the way rules are interpreted to complex ones like those which send specific HTTP headers back to the client when a match is found on the pattern.
Let's look at some of the basic flags.
- [NC] flag (nocase) - This makes mod_rewrite treat the pattern in a case-insensitive manner.
- [F] flag (forbidden) - This makes Apache send a forbidden HTTP response - response 403 - back to the client.
- [R] flag (redirect) - This flag makes mod_rewrite use a formal HTTP redirect instead of the internal Apache redirect. You can use this flag to inform the client about the redirection. It sends a Moved Temporarily - response 302 - by default, but it takes an extra parameter which you can use to modify the response code. If you wish to send a response code of 301 - Moved Permanently - then this flag can be written as [R=301]
- [G] flag (gone) - This flag makes Apache respond with an HTTP response 410 - Gone.
- [L] flag (last) - This makes mod_rewrite stop processing subsequent rules if the current rule matches.
- [N] flag (next) - This flag makes the rewrite engine stop processing and loop back to the start of the rule list. A point to note is that the URL used for pattern matching will be the rewritten one. This flag can create an endless loop, so extreme care should be taken while using it.
There are other flags too, but they are too complex to explain within the scope of this article; you can find more information on them by referring to the mod_rewrite manual.
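To make the flags listed above a little more concrete, here is a minimal sketch of a .htaccess file that uses a few of them. All file and directory names are made up for illustration. A dash as the substitution means "do not change the URL", which is the usual companion of [F] and [G].

```apache
RewriteEngine on
# Permanent external redirect: the browser's address bar changes (hypothetical file names)
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
# Answer 403 Forbidden for anything under private/, matched case-insensitively
RewriteRule ^private/ - [F,NC]
# Answer 410 Gone for a page that has been removed for good
RewriteRule ^discontinued\.html$ - [G]
```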
2. The RewriteCond Directive
This directive gives you the added power of conditional checking on a range of parameters and conditions. When combined with RewriteRule, it lets you rewrite URLs based on whether the conditions succeed. RewriteCond directives are like the if() statement in your programming language, but here they decide whether a RewriteRule directive's substitution should take place or not. Things like preventing hot linking, or checking whether the client meets certain criteria before rewriting the URL, can be achieved by using this directive.
The general syntax of RewriteCond is:
RewriteCond string-to-test condition-pattern
The string-to-test part of RewriteCond has access to a large set of variables, such as the HTTP header variables, request variables, server variables, time variables and so on, so you can do a lot of complex conditional checking while writing directives. You can use any of these variables as the string to test by putting it in the %{NAME} format. Suppose you want to use the HTTP_REFERER variable; then it can be written as %{HTTP_REFERER}.
The condition part can be a simple string or a very complex regular expression, as your imagination is the only limit with this module.
Let's take a look at an example of conditional rewriting using the RewriteCond directive:
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4(.*)MSIE
RewriteRule ^index.html$ /index.ie.html [L]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5(.*)Gecko
RewriteRule ^index.html$ /index.netscape.html [L]
RewriteRule ^index.html$ /index.other.html [L]
This example uses HTTP_USER_AGENT as the test string with the RewriteCond directive. It uses the HTTP_USER_AGENT header variable to find the browser of the visiting user, matches it against a set of known values to detect the browser, and serves different pages to the visitor based on the match result. The first RewriteCond checks HTTP_USER_AGENT for a match on the ^Mozilla/4(.*)MSIE pattern. This match will occur when a user visits the page using IE as the browser. Then the RewriteRule given just under that statement will kick in and rewrite the URL to serve the index.ie.html page to the IE visitor.
Similarly, a check is made for Mozilla-specific browsers in the second RewriteCond, and the RewriteRule will do the substitution for index.netscape.html when a positive match is made on the ^Mozilla/5(.*)Gecko pattern. The third RewriteRule is there to catch other browsers. If both the first and second RewriteCond fail, then the last RewriteRule will be considered. A point to note in the above example is the usage of the [L] flag with all the RewriteRule directives. This is used to avoid cascading through the remaining rules once a given RewriteRule has been applied.
Two flags which can be used to further control the way the RewriteCond directive behaves are [NC] - case-insensitive - and [OR] - chaining of several RewriteCond directives with a logical OR.
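Hot-link prevention, mentioned earlier as a typical use of RewriteCond, can be sketched along these lines. This is only an illustration: example.com stands in for your own domain, and the list of image extensions is an assumption. Consecutive RewriteCond lines are combined with a logical AND unless [OR] is given.

```apache
RewriteEngine on
# Only consider requests that carry a Referer header (empty ones are let through)
RewriteCond %{HTTP_REFERER} !^$
# ... and whose Referer is not one of our own pages (example.com is a placeholder)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Such requests for image files are answered with 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

Letting empty referers through is a deliberate choice in this sketch, since some browsers and proxies do not send the header at all.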
By using these two directives - RewriteRule and RewriteCond - you can implement a lot of powerful URL rewriting functionality on your website.
Other mod_rewrite Directives
- RewriteBase Directive - This directive can solve the problem of RewriteRule creating non-existent URLs due to a mismatch between the physical file system structure on the web server and the structure of the website's URLs. Setting this directive as given below can solve this problem: RewriteBase /
- RewriteMap Directive - This directive is very important, as it allows you to map unique values to a set of replacement values from a table and to use them in the substitution to create URLs on the fly. This can be especially beneficial for huge e-commerce or CMS kinds of applications, where you need to replace each section name or category name in the URL with a corresponding id taken from a database.
- RewriteLog Directive - This directive can be used to set the log file that the mod_rewrite engine will use to log all the actions taken while processing client requests. The syntax is: RewriteLog /path/to/logfile This directive should be defined in the httpd.conf file, as it is applied on a per-server basis.
- RewriteLogLevel Directive - This directive tells the mod_rewrite module how much information about the internal processing done while rewriting URLs should be logged. It takes values from 0 to 9, where 0 means no logging and 9 means all the information is logged. A higher level of logging can make Apache run slowly, so a level above 2 is advisable only for debugging purposes. This directive can be applied using the syntax given below: RewriteLogLevel levelnumber
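The directives in this list are easier to picture together. Below is a minimal sketch for httpd.conf, assuming a hypothetical text map file and log file; the paths and the map name "catmap" are placeholders. Two things are worth knowing here: RewriteMap has to be defined in the server or virtual host configuration rather than in .htaccess, and RewriteLog and RewriteLogLevel exist only in Apache 2.2 and earlier, while Apache 2.4 uses the LogLevel directive instead.

```apache
# httpd.conf (sketch; file paths and the map name "catmap" are placeholders)
RewriteEngine on

# Text map with one "key value" pair per line, for example: stamps 12
RewriteMap catmap txt:/etc/apache2/category-map.txt

# In server context the pattern sees the leading slash of the URL path.
# Look the category name up in the map; fall back to 0 if it is not listed.
RewriteRule ^/articles/(\w+)/([0-9]+)$ /articles.php?category=${catmap:$1|0}&id=$2 [L]

# Apache 2.2 and earlier: log rewrite processing at a moderate level
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 2

# Apache 2.4 and later dropped the two directives above; use this instead:
# LogLevel alert rewrite:trace2
```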
Conclusion
In this article we have taken only a brief look at the power of the mod_rewrite module. It only scratches the surface, but I hope it is sufficient to get you started on using this module in your web server environment.
You can read about the mechanics and benefits of URL rewriting in my previous article, which can be accessed from here.
How to Use Mod-Rewrite to Simplify URL Rewriting in Apache - A Basic Guide to the Mod-Rewrite Module