.htaccess Magic! - Part II
OutFront News
Article: June, 2002
Well, first another boring bit! To prevent people from being able to see the
contents of your .htaccess file, you need to place the following code in the
file:
<Files .htaccess>
order allow,deny
deny from all
</Files>
Be sure to format that just as it is above, with each line on a new line as
shown. There is every likelihood that your existing .htaccess file, if you have
one, includes those lines already.
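On servers running Apache 2.4 or later, the order/deny style is only understood through a compatibility module, which your host may or may not have enabled. A minimal sketch of the equivalent block, assuming Apache 2.4 with its standard authorization module:

```apache
# Assumes Apache 2.4 or later; on older servers keep the block shown above.
<Files ".htaccess">
    Require all denied
</Files>
```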
Magic Trick No. 1: Redirect to Files or Directories
You have just finished a major overhaul of your site, which unfortunately
meant renaming many pages that have already been indexed by search
engines, and quite possibly linked to or bookmarked by users. You could use a redirect
meta tag in the head of the old pages to bring users to the new ones, but some
search engines may not follow the redirect and others frown upon it.
.htaccess leaps to the rescue!
Enter this line in your .htaccess file:
Redirect permanent /oldfile.html http://www.domain.com/filename.html
You can repeat that line for each file you need to redirect. Remember to
include the directory name if the file is in a directory other than the root
directory:
Redirect permanent /olddirectory/oldfile.html http://www.domain.com/newdirectory/newfile.html
If you have just renamed a directory you can use just the directory name:
Redirect permanent /olddirectory http://www.domain.com/newdirectory
(Note: each of the above commands should be on a
single line. They may wrap here, but make sure
they are on a single line when you copy them into your file.)
This has the added advantage of preventing 'link rot', an increasing problem
on the Internet as people change their sites. Now people who have
linked to pages on your site will still have functioning links, even if the
pages have changed location.
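If a whole batch of pages changed in the same way, listing each one becomes tedious. As a sketch, assuming every .htm page in a hypothetical /olddirectory now lives in /newdirectory with an .html extension, the RedirectMatch directive (from the same Apache module as Redirect) can cover them with a single pattern:

```apache
# Hypothetical directory names; adjust the pattern to your own site.
# $1 stands for whatever the part in brackets matched, i.e. the page name.
RedirectMatch permanent ^/olddirectory/(.*)\.htm$ http://www.domain.com/newdirectory/$1.html
```

As with Redirect, the whole command must sit on a single line.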
Magic Trick No. 2: Change the Default Directory Page
In most cases the default directory page is index.htm or index.html. Many
servers allow a range of pages called index, with a variety of extensions, to be
the default page.
Suppose though (for reasons of your own) you wish a page called honeybee.html
or margarine.html to be a directory home page?
No problem. Just put the following line in your .htaccess file for that
directory:
DirectoryIndex honeybee.html
You can also use this command to specify alternatives. If the first filename
listed does not exist the server will look for the next and so on. So you might
have:
DirectoryIndex index.html index.htm honeybee.html margarine.html
(Again, the above should all be on a
single line)
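The entries in the list do not all have to sit in the requested directory. A small sketch, assuming a hypothetical catch-all page called notice.html in the root of your site:

```apache
# Look for index.html in the requested directory first; if it is missing,
# serve /notice.html (a hypothetical page in the site root) instead.
DirectoryIndex index.html /notice.html
```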
Magic Trick No. 3: Allow/Prevent Directory Browsing
Most servers are configured so that directory browsing is not allowed. That
is, if people enter the URL of a directory that does not contain an index file,
they will not see the contents of the directory but will instead get an error
message. If your site is not configured this way you can prevent directory
browsing by adding this simple line to your .htaccess file:
IndexIgnore */*
But there may be times when you want to allow browsing, perhaps to allow
access to files for downloading or for whatever reason, on a server configured
not to allow it. You can override the server's settings with this line:
Options +Indexes
Easy!
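The Options directive works in the other direction too. As a hedged alternative to hiding the entries with IndexIgnore, and assuming your host lets .htaccess files change Options, this line switches automatic listings off altogether, so that a directory without an index page answers with a 403 error:

```apache
# Disable automatic directory listings for this directory and those below it.
Options -Indexes
```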
Magic Trick No. 4: Allow SSI in .html files
Most servers will only parse files ending in .shtml for Server Side Includes.
You may not wish to use this extension, or you may wish to retain the .htm or
.html extension used by files prior to your changing the site and using SSI for
the first time.
Add the following to your
.htaccess file:
AddType text/html .html
AddHandler server-parsed .html
AddHandler server-parsed .htm
You can add both extensions or just one.
Remember though that files which must be parsed by the server before
being displayed will load more slowly than standard pages. If you change things
as above, the server will parse all .html and .htm pages, even those that do not
contain any includes. This can significantly, and unnecessarily, slow down the
loading of pages without includes.
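One way to limit that cost is to have the server parse only the pages you mark as executable. This is a sketch that assumes a Unix-style host with mod_include enabled and permission to change file modes; it would be used in place of the AddHandler lines above:

```apache
# Permit Server Side Includes in this directory.
Options +Includes
# Parse a text/html file for includes only if its execute bit is set.
XBitHack on
```

You would then mark each page that contains includes, for example with `chmod +x pagename.html` (pagename.html being a placeholder), and leave the others alone.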
Magic Trick No. 5: Keep Unwanted Users Out
You can ban users by IP address or even ban an entire range of IP addresses.
This is pretty drastic action, but if you don't want them, it can be done very
easily.
Add the following lines:
order allow,deny
deny from 123.456.78.90
deny from 123.456.78
deny from .aol.com
allow from all
The second line bans the single IP address 123.456.78.90.
The third line bans every address that begins with 123.456.78
and so is much more drastic. (These numbers are only placeholders:
each part of a real IP address runs from 0 to 255.) The
fourth line bans everyone from AOL. A somewhat
excessive display of power perhaps!
One thing to bear in mind here is that banned users
will get a 403 error - "You do not have
permission to access this site". That is fine
unless you have configured a custom 403 error page
that in effect lets them in. So if you are banning
users, for whatever reason, make sure your 403 error
message is a dead end.
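On Apache 2.4 or later the same bans are normally written with Require directives. The sketch below assumes such a server; the addresses come from a range reserved for documentation and the host name is a placeholder, so none of them should be copied as they stand. It also shows one way to keep the 403 message a dead end, by using a short plain-text message rather than a page with links:

```apache
# Assumes Apache 2.4 or later. Addresses and host name are placeholders.
<RequireAll>
    Require all granted
    # One address
    Require not ip 192.0.2.90
    # Every address that begins with 192.0.2
    Require not ip 192.0.2
    # Every visitor whose host name ends in example.com
    Require not host example.com
</RequireAll>

# A plain-text 403 message with no links, so banned visitors reach a dead end.
ErrorDocument 403 "You do not have permission to access this site."
```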
Magic Trick No. 6: Prevent Linking to Your Images
The greatest and most irritating bandwidth leech is having someone link to
images on your site. You can foil such thieves very easily with .htaccess. Copy
the following into your .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain\.com/.*$ [NC]
RewriteRule \.(gif|jpg)$ - [F]
You don't need to understand any of that! Just change 'domain\.com'
to the name of your domain, keeping the backslash in front of each dot.
(Again each command should be on a single line. There are 4 lines above,
each starting with 'Rewrite')
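The pattern above only recognises addresses that start with http://. As a sketch, assuming your site is also reachable over https and that you want to protect .png and .jpeg files as well, the same four lines could be written like this:

```apache
RewriteEngine on
# Let requests through when the browser sends no referrer at all.
RewriteCond %{HTTP_REFERER} !^$
# Let requests through when the referrer is your own site, http or https.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domain\.com/ [NC]
# Refuse everything else that asks for an image, whatever the case of the extension.
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```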
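If you want to really let them know they have been rumbled, why not make an
image that tells them so, call it stealing.gif, save it to your images
directory and put the following two lines in place of the last line of the
code above (the RewriteRule ending in [F]):
RewriteCond %{REQUEST_URI} !/images/stealing\.gif$
RewriteRule \.(gif|jpg)$ http://www.domainname.com/images/stealing.gif [R,L]
(Each of the above commands should be on a single line. The conditions only
apply to the RewriteRule that directly follows them, which is why the new rule
replaces the old one rather than being added after it. The extra RewriteCond
stops stealing.gif itself from being redirected, which would otherwise send
the browser round in a loop.)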
Magic Trick No. 7: Stop the Email Collectors
While you positively want to encourage robot visitors from the search
engines, there are other less benevolent robots you would prefer stayed away. Chief
among these are those nasty 'bots that crawl around the web sucking email
addresses from web pages and adding them to spam mail lists.
RewriteCond %{HTTP_USER_AGENT} Wget [OR]
RewriteCond %{HTTP_USER_AGENT} CherryPickerSE [OR]
RewriteCond %{HTTP_USER_AGENT} CherryPickerElite [OR]
RewriteCond %{HTTP_USER_AGENT} EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ExtractorPro
RewriteRule ^.*$ X.html [L]
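Note that an '[OR]' appears at the end of each line naming a robot, except the
last. Don't forget to include it if you add any others to this list. The final
line sends any robot that matches to a page called X.html. These lines rely on
'RewriteEngine on' appearing earlier in the file, as in Magic Trick No. 6.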
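The same idea can be written more compactly. This sketch uses robot names from the list above (Wget is left out, an arbitrary choice for the example) and refuses the request outright rather than serving a page:

```apache
RewriteEngine on
# One condition with alternatives replaces a chain of [OR] lines.
# [NC] makes the match case-insensitive.
RewriteCond %{HTTP_USER_AGENT} (CherryPicker|EmailCollector|EmailSiphon|EmailWolf|ExtractorPro) [NC]
# Answer matching robots with a 403 error instead of a page.
RewriteRule .* - [F]
```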
This is by no means foolproof. Many of these sniffers do not identify
themselves and it is almost impossible to create an exhaustive list of those
that do. It's worth a try though if it even keeps some away. The above are as
many as I could find.
...and Finally
There is one very important use of the .htaccess file that we have not really
mentioned: user authentication. It is perfectly possible to configure your
.htaccess files by hand to control access to directories on your site, but this
is rarely necessary. In most cases your host will provide a much easier way to
configure the file from your hosting control panel, and there are a myriad of
Perl scripts that will allow you to set up full user management systems by
harnessing the power of .htaccess.
If you do want to go it alone, there is a tutorial
here that will get you there: http://www.apacheweek.com/features/userauth
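For orientation, a hand-written setup usually comes down to a few lines like the following. This is only a sketch: the path and the realm name "Members Only" are hypothetical, and it assumes your host allows authentication directives in .htaccess files.

```apache
# Hypothetical path and realm name. The password file should live
# outside the directories your web server publishes.
AuthType Basic
AuthName "Members Only"
AuthUserFile /full/path/to/.htpasswd
Require valid-user
```

The password file itself is normally created with the htpasswd utility that ships with Apache, for example `htpasswd -c /full/path/to/.htpasswd username`.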
If you are looking for scripts, there are many here:
http://www.hotscripts.com/CGI_and_Perl/Scripts_and_Programs/Password_Protection/index.html
Two scripts that I have used and can recommend are:
1. Locked Area
The free version will be adequate for many situations,
though both versions will give you control over access
to one directory and its contents only.
http://www.locked-area.com/html/
2. Password Manager
Allows you very sophisticated control over access to
multiple directories. Not cheap but very good value for
money.
http://www.cgi-world.com/password_manager.html
Have fun!
Katherine Nolan
OutFront Moderator
http://www.inkkdesign.com/