Stop Hotlinking and Bandwidth Theft With .htaccess Files

Image galleries may find competitors stealing not only their content, but also costly resources such as data transfer. Block this theft with Apache .htaccess files.

Michael David Crawford
Dulcinea Technologies Corporation
mike@dulcineatech.com
June 14, 2012 - First Draft

Copyright © 2012 Michael David Crawford. All Rights Reserved.

I Have Not Yet Begun to Write!

However, I have some quick tips for you:

If you control the entire web server rather than using a shared hosting service, there are many good reasons to disable .htaccess files altogether. If you can edit your site's main configuration file - either httpd.conf or a Virtual Host configuration file - place the directives I will be recommending there rather than in an .htaccess file.
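
As a minimal sketch of that setup, the main configuration file can switch off .htaccess processing with AllowOverride None. The directory path below is a hypothetical document root, not one taken from any real site; substitute your own.

```apache
# Main server or Virtual Host configuration, not an .htaccess file.
# /var/www/example is a hypothetical document root.
<Directory "/var/www/example">
    # Ignore .htaccess files in this directory and everything beneath it
    AllowOverride None

    # Directives that would otherwise live in .htaccess go here instead
</Directory>
```

With AllowOverride set to None, httpd no longer looks for .htaccess files under that directory, so any directives must live in the main configuration and take effect only after a reload or restart.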

Simply enabling .htaccess files - also known as per-directory configuration files - makes your entire server run slower, even if you don't actually have any such files on your server. That's because the httpd process has to use the stat(2) system call to check for the existence of an .htaccess file whenever it searches a directory.

There are many ways that .htaccess files can open your entire site up to security holes. Diligent webmasters and SysAdmins can keep a lid on such problems, but that means extra labor for your staff and extra cost for your company.

If you use one of the inexpensive shared hosting services - notably, those that provide the "cPanel" Graphical User Interface that webmasters and web designers can use to control the configuration of their sites - then you have no choice but to use .htaccess. If your host provides cPanel or some other GUI, quite likely you can use the GUI to configure the Bandwidth Theft-Blocking Directives. If not, you can create the required .htaccess file on your personal computer and then upload it to the appropriate directory - or folder - at your host.

How to Find Out that Theft is Occurring

You might well get the bad news for the very first time when your host charges you an unexpectedly large amount of money for transferring far more outbound data than your monthly quota. For extra credit, they might shut your site down entirely because those who are stealing your content by "HotLinking" it are consuming so many server resources as to interfere with other users of your same shared host.

Thus it is very, very important to detect this problem early!

The way to discover such HotLinking is to analyze the web server log files that web hosts provide to their users. There are many ways to analyze your log files; I recommend Analog. It is Open Source and free of charge, of very high quality, reliable and robust. It processes even very large log files efficiently, so it can readily provide frequent traffic updates even for very popular websites.

While Analog is quite easy to configure and use, its documentation was written with experts in mind. My own experience was that I required years of study and experimentation before I was really able to get Analog to serve my needs.

That was not at all because Analog is a poor quality product or is in reality hard to use. It is because its manual was written with the assumption that the reader already knew a great deal about web servers, log files and the information they were seeking through log analysis. The "Clueless Newbie" - such as, at one time, myself - is totally lost in Analog's exceedingly technical online manual.

To aid less technical folk in getting started with Analog, as well as developing powerful, custom analysis solutions, I am working on a tutorial:

There are three items to check in your Analog report. First, the total data transfer, given in the top section. Second, the Request Report, which shows the number of HTTP requests for each individual URL on your site. Third, if you find so much as a single document getting an unusually large number of requests, the Referring Page Report, which provides the URLs of the documents that have HotLinked your images or videos.
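
A referrer report can only be built if the server records the Referer header of each request. Shared hosts usually take care of that for you. If you administer the server yourself, the relevant main-configuration lines might look like the sketch below; the log path is a hypothetical one, and "combined" is simply the conventional nickname for this format.

```apache
# Server or Virtual Host context only; these directives are not allowed in .htaccess.
# The log file path is hypothetical.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "/var/log/apache2/access.log" combined
```

The %{Referer}i field is what ties each image request to the page that embedded it.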

How to Stop the Bandwidth Theft

The short answer is to configure Apache's mod_rewrite to alter the requested URL so that some other document is provided. The usual solution is to provide an alternative image that informs the end-user that the requested image has been blocked due to a violation of HotLinking policy. You can even provide an Error 404: Page Not Found result if you like.
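
A minimal sketch of such a rule set follows, assuming mod_rewrite is enabled and your host permits it in .htaccess. The domain example.com and the replacement image /images/hotlink-blocked.png are placeholders of my own choosing, not names from any real site.

```apache
# .htaccess in the directory that holds the images.
# example.com and /images/hotlink-blocked.png are placeholders.
RewriteEngine On

# Let through requests that carry no Referer at all (direct visits, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Let through requests referred by your own pages
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
# Never rewrite the replacement image itself, which would cause a loop
RewriteCond %{REQUEST_URI} !/hotlink-blocked\.png$
# Every other image request gets the replacement image instead
RewriteRule \.(gif|jpe?g|png)$ /images/hotlink-blocked.png [NC,L]

# To answer 404 Not Found instead, use this rule in place of the one above:
# RewriteRule \.(gif|jpe?g|png)$ - [R=404,NC,L]
```

Allowing an empty Referer is a judgment call: it avoids breaking images for visitors whose browsers or proxies strip the header, at the cost of letting some hotlinked requests through.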

If the data transfer consumption is not too costly, an alternative solution that I myself have had great success with is to keep serving images to the HotLinking pages, but to substitute alternative versions that explicitly credit, and give the URL of, the site from which they were stolen.

That will have the happy result of attracting new visitors from the competing sites that steal your content, until their own webmasters get around to inspecting the pages containing your HotLinked images.
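
One way this could be wired up, as a sketch only: keep a credited copy of each image under a separate directory and rewrite outside requests to it. The /credited/ directory and the example.com domain are hypothetical, and preparing the credited copies themselves is left to your image tools.

```apache
# .htaccess at the document root.
# /credited/ is a hypothetical directory holding a credited copy of each
# image under the same relative path; example.com is a placeholder.
RewriteEngine On

# Requests with no Referer, or referred by your own pages, get the original
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
# Skip requests that already point at the credited copies
RewriteCond %{REQUEST_URI} !^/credited/
# photos/cat.jpg is served from /credited/photos/cat.jpg
RewriteRule ^(.+\.(gif|jpe?g|png))$ /credited/$1 [NC,L]
```

Any image without a credited copy would return 404 Not Found to outside pages under this sketch, so the /credited/ tree needs to mirror the originals.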

Back To You Real Soon Now With The Full Details
