http://Check-these.info/AntiLeech.html


What is "Leeching"?

When another site uses resources from your site (images, etc.)
in its own pages, it's often called "Leeching".

It's considered immoral, and it's hated by many people who spent a lot of
energy and time creating these resources.

It could be illegal, too.

Technically speaking, Leeching is the same situation as direct linking.
In other words, the resources are requested without a request
for the pages of yours that use them.

It can be detected when the HTTP_REFERER env variable (the Referer request header)
does not point to one of your own pages, from the access history, or by other methods.
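
For the access-history route, here is a minimal sketch. It assumes the log is in Apache's "combined" format and that your site is mysite.com (a placeholder); list_leechers is just a name I made up, and the match on the site name is case-sensitive.

```shell
#!/bin/sh
# Sketch only. Assumes Apache "combined" log lines, where splitting on
# double quotes gives: $2 = request line, $4 = Referer, $6 = User-Agent.
# mysite.com is a placeholder for your own host name.
SITE_RE='^https?://(www[.])?mysite[.]com/'

# Reads log lines on stdin; prints "count referer" for every foreign page
# that pulled one of the listed file types, busiest first.
list_leechers() {
  awk -F'"' -v site="$SITE_RE" '
    {
      split($2, req, " ")   # req[2] is the requested path
      ref = $4
      if (req[2] ~ /[.](jpg|mov|png|gif)$/ && ref != "" && ref != "-" && ref !~ site)
        count[ref]++
    }
    END { for (r in count) print count[r], r }
  ' | sort -rn
}
```

It would be used as "list_leechers < access_log", where access_log stands for wherever your host keeps the log.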

Is it a big problem?

For people who care a lot about their creations, it is a very sensitive issue.
To them it's stealing. It may be a copyright violation, too.

In terms of added traffic and bandwidth, it can lead to additional fees
or a site shutdown, depending on the host/services you are using.

If you don't care about other people using your materials,
and if you don't care about added traffic/bandwidth,
then it's not a big problem, and you can just forget it.

Warning

Unless leeching really IS a problem for you, I recommend that you stay away
from Anti-Leech measures.

Common Anti-Leech measures aren't 100% effective, and they can affect the way your site works.

Badly crafted Anti-Leech code will simply break your site.

Again, don't use it unless you really need it.

Common Anti-Leech measures

There are two common Anti-Leech methods (for the Apache web server).
  1. Using SetEnvIf with the Allow/Deny access directives.
  2. Using mod_rewrite (RewriteRule).

To use these methods, you must be able to create/edit "httpd.conf" or ".htaccess",
and the respective directives must be enabled.

In a shared hosting scenario, they go into a file named .htaccess placed in the document root.

Both methods check HTTP_REFERER and, if the request is considered "Leeching",
either refuse it or rewrite it (internally or externally) to another file/URL.

Example using SetEnvIf:
# Apache 2.2 syntax; on Apache 2.4, replace the Order/Allow pair with "Require env locally_linked"
SetEnvIfNoCase Referer "^https?://(www\.)?mysite\.com/" locally_linked=1
SetEnvIfNoCase Referer "^$" locally_linked=1
<FilesMatch "\.(jpg|mov|png|gif)$">
  Order Allow,Deny
  Allow from env=locally_linked
</FilesMatch>

Example using RewriteRule:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule \.(jpg|mov|png|gif)$ - [F,NC]

Variation using RewriteRule:
This sends a blank text file instead of answering with the "Forbidden" error page.
It uses less bandwidth, fewer server resources, and possibly fewer hits.
You need to create a zero-length file named blank.txt in the document root.
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule \.(jpg|mov|png|gif)$ blank.txt [L,NC]
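
To check whichever ruleset you installed without opening a browser, something like the following sketch can help. It uses curl's -e (Referer) and -w (write-out) options; fetch_with_referer is a made-up name, and the URLs in the usage lines are placeholders.

```shell
#!/bin/sh
# Sketch only. Prints "<status code> <bytes received>" for one request.
#   $1 = URL of a protected file, $2 = Referer to send ("" sends none)
fetch_with_referer() {
  if [ -n "$2" ]; then
    curl -s -o /dev/null -e "$2" -w '%{http_code} %{size_download}\n' "$1"
  else
    curl -s -o /dev/null -w '%{http_code} %{size_download}\n' "$1"
  fi
}
```

Calling "fetch_with_referer http://www.mysite.com/pic.jpg http://other.example/page.html" should report status 403 with the SetEnvIf and [F] examples, and status 200 with 0 bytes with the blank.txt variation. The same call with "" as the second argument should return the real file.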

Note:

There is an infinite number of variations depending on the structure of your site,
the location and/or naming of resources, and other tactics/requirements.

If you use the RewriteRule method, you should be very careful about
possible interactions with other rewrite code.

RewriteRule is a very touchy matter, and it can cause problems
if you don't handle it with care.

Limitation

As these methods rely on HTTP_REFERER, they don't work if the browser
or robot sends a bogus HTTP_REFERER or none at all.
(Roughly 10 to 20% of browsers don't send HTTP_REFERER, which is why the
examples above let an empty one through.)

So, these methods are not 100% effective.

They will work relatively well if you just want to decrease the leeching traffic.
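
To see the limitation in action, here is a sketch that mimics roughly the same Referer test in shell. referer_allowed is a made-up name and mysite.com is the placeholder host from the examples; it is an illustration of the logic, not what Apache runs.

```shell
#!/bin/sh
# Sketch only: mimics the Referer test made by the rules above.
# Exit status 0 = request would be served, 1 = treated as leeching.
referer_allowed() {
  # An empty Referer is let through, just like the "^$" line.
  [ -z "$1" ] && return 0
  # Case-insensitive match against our own site, like SetEnvIfNoCase / [NC].
  printf '%s\n' "$1" | grep -Eiq '^https?://(www\.)?mysite\.com/'
}
```

Here referer_allowed "http://other.example/page.html" fails, but referer_allowed "" and referer_allowed "http://www.mysite.com/faked.html" both succeed, so a robot that sends no Referer or a forged one gets the file anyway.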

Alternatives

When you want more control, you have to think about alternative methods.

Example: (I just wrote this on the fly, and it is not tested)
  1. Put all resources to be protected in a non-public, password-protected directory.
    /htdocs/__hidden__/whatever.jpg
    /htdocs/__hidden__/not_so_public.mov
    ...
    
  2. Create the directory that will hold the symlinks and be shown as the public location of these files.
    /htdocs/resources/
    
  3. Make a list of the files that contain link(s) to these resources as a plain text file, one file per line (in this case as absolute paths), name it flist.txt, and place it above htdocs.
    /www/U/USER/htdocs/index.php
    /www/U/USER/htdocs/forum/some.html
    ...
    
  4. Create a crontab entry.
    28 */4 * * * etc/rotateln.cgi >>etc/rotateln.log 2>&1
    
  5. Put rotateln.cgi in etc and set its permissions to 700.
    
    #!/bin/sh
    # store the Unix time (in seconds) in variable D
    D=`date "+%s"`
    
    # cut off the first 4 digits
    D=${D#????}
    
    # loop over all files in htdocs/__hidden__
    for f in "$HOME"/htdocs/__hidden__/*
    do
      # create a symlink for each file in htdocs/resources/, prefixed with variable D + _
      ln -s "$f" "$HOME/htdocs/resources/${D}_${f##*/}"
    done
    
    # delete symlinks older than 8 hours (480 minutes)
    find "$HOME/htdocs/resources" -type l -mmin +480 -delete
    
    # replace the old prefix with the new one in every file listed in flist.txt
    # (${1} is written with braces so that the digits of D are not read as part of it)
    xargs perl -i -pe "s#(resources/)\\d+_#\${1}${D}_#g;" < "$HOME/flist.txt"
    

The details (the frequency of the rotation, the name of the resource directory,
and the prefix string) can be changed as you wish.
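
To make the prefix and link rewriting easier to try out by hand, here is a small sketch of just those two steps. Both function names are made up for this illustration, and it uses sed instead of perl so that it only touches what you pipe into it.

```shell
#!/bin/sh
# Sketch only; both function names are made up for this illustration.

# Drop the first 4 characters of a Unix timestamp, as rotateln.cgi does.
rotation_prefix() {
  printf '%s\n' "${1#????}"
}

# Rewrite every "resources/<digits>_" on stdin to use the prefix given in $1.
rewrite_links() {
  sed -E "s#(resources/)[0-9]+_#\\1${1}_#g"
}
```

For example, "rotation_prefix 1700000000" prints 000000, and piping a line containing resources/123456_whatever.jpg through "rewrite_links 654321" turns it into resources/654321_whatever.jpg.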

Leeching traffic cannot get the files because it doesn't know the current password, unlike people who visited your page.
This can be done with one crontab entry and one script.

Of the three methods I mentioned, the first one is the most transparent for the user.
The last one will be unpopular among those who hate cookies.
The 2nd one is a little bothersome because the users have to type in a password.

As far as overhead is concerned, the first one will become heavy if there are lots of resources
and/or pages linking to the resources. (More than a hundred ... maybe a bit more.)
The 2nd one increases the hit count a little, but the number of resources and of pages linking to them doesn't matter.
The 3rd one must use a script (CGI/PHP) for the linking pages, and that may affect speed and stability.
(A tiny shell script will do.)

I'd imagine there are more ways to do this.

