Monday, 25 July 2011

Transparent Virus Scanning, Ad-blocking Proxy with Ubuntu + Squid3 + Netgear DG834G

I know this is supposed to be a law firm, but let me indulge in my alter-geekness.
The title is fairly self-explanatory, but let me paint a little ASCII picture for you as to how my firm network is now set up:

INTERNET <==> DG834G ROUTER (192.168.0.1) <==> [HAVP, SQUID3, UBUNTU (192.168.0.3)] <==> USERS

WHAT THIS MEANS for "I just want to browse teh interwebs" people: your web browsing gets automatically virus-scanned, ad-blocked, and accelerated (and cached, and logged).


Phase 1 - Set up Squid3
  • Find a machine and set up Ubuntu.
  • Install Squid 3 by entering "sudo apt-get install squid3" into a terminal.
  • Follow this blog to enable other computers to use your new squid proxy server.
  • Reload squid (important to do this after you alter your squid.conf file):
sudo /etc/init.d/squid3 reload
  • Test. Change a PC's browser settings to use your ubuntu box as a proxy. The default squid port is 3128.
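If you would rather test from a terminal than fiddle with browser settings, something like the sketch below should do. It assumes curl is installed on the client and that the proxy sits at 192.168.0.3:3128 as in my diagram; the function names are my own invention. Squid normally stamps responses with Via and X-Cache headers, so seeing either one suggests the request really went through the proxy.

```shell
#!/bin/sh
PROXY_HOST=192.168.0.3   # the Ubuntu box from the diagram
PROXY_PORT=3128          # squid's default port

# Fetch only the response headers for a URL, going through the proxy.
fetch_headers_via_proxy() {
    curl -s -I -x "http://$PROXY_HOST:$PROXY_PORT" "$1"
}

# Read HTTP headers on stdin; succeed if squid's usual Via or X-Cache
# header is present (either can be suppressed in squid.conf).
went_through_squid() {
    grep -Eqi '^(via|x-cache):'
}

# Example, needs a running squid:
#   fetch_headers_via_proxy http://example.com/ | went_through_squid && echo "proxy OK"
```

If the example prints nothing, check that squid.conf actually allows your PC's address before blaming the network.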
Phase 2 - Extend Squid with Ad-blocking
  • After much painful playing around, I found Squid's internal access control list ("ACL") engine is the best for blocking ads.
  • Two lines in your squid configuration file ('/etc/squid3/squid.conf') will do it: 
acl ads dstdom_regex -i "/etc/squid.adservers"
http_access deny ads
  • You can populate the file squid.adservers with this output in regular expression ("regex") format.
  • In future you will want to use a script to keep the blacklist of ad servers updated. Save this script to a file, say, 'update-squid-adservers.sh', edit it to suit, then change its ownership and permissions so that it is executable:
sudo chown root:root update-squid-adservers.sh
sudo chmod u+x update-squid-adservers.sh
  • You can create a cron job to run the script perhaps once a week. I use webmin to manage my scheduled tasks. Webmin also has a very nice module for monitoring and configuring squid proxy.
  • Test. With websites like engadget.com and slashdot.org, you will notice a greater concentration of content in the absence of ads. YouTube.com will display a big fat 'Error' in the middle of the screen as a stonking great big ad is denied. The rest of the site will work fine, though. You will never want to stop using your proxy after this phase. But wait, there's more.
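For the curious, an update script along these lines might look like the sketch below. It is not the script I linked to, just a minimal illustration. The list URL is an assumption (as far as I know, the pgl.yoyo.org ad server list can be served in squid dstdom_regex format), and the function names are made up. The target file and the reload command are the ones used above.

```shell
#!/bin/sh
# Hypothetical sketch of update-squid-adservers.sh.
# LIST_URL is an assumption; point it at whichever list you trust.
LIST_URL='https://pgl.yoyo.org/adservers/serverlist.php?hostformat=squid-dstdom-regex&showintro=0&mimetype=plaintext'
TARGET='/etc/squid.adservers'

# Move file $1 over $2 only if $1 is non-empty, so a failed download
# never leaves squid with an empty blacklist.
install_if_nonempty() {
    if [ -s "$1" ]; then
        mv "$1" "$2"
    else
        rm -f "$1"
        return 1
    fi
}

# Download the list to a temp file, install it, and reload squid.
update_adservers() {
    tmp=$(mktemp) || return 1
    wget -q -O "$tmp" "$LIST_URL" || { rm -f "$tmp"; return 1; }
    install_if_nonempty "$tmp" "$TARGET" || return 1
    chmod 644 "$TARGET"
    /etc/init.d/squid3 reload
}

# Uncomment to make the script perform the update when run as root:
# update_adservers
```

A weekly cron entry for root could then be as simple as "0 3 * * 0 /usr/local/sbin/update-squid-adservers.sh", where the path is only an example of where you might keep the script.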
Phase 3 - Extend Squid with Anti-Virus
  • Install havp on Ubuntu:
sudo apt-get install havp
  • I use havp as squid's parent proxy - a proxy before a proxy - because I want files to be scanned before they even reach squid.
  • Add these lines to your squid.conf:
acl all src 0.0.0.0/0.0.0.0
cache_peer 127.0.0.1 parent 8080 0 no-query no-digest no-netdb-exchange default
cache_peer_access 127.0.0.1 allow all
acl Scan_HTTP proto HTTP
never_direct allow Scan_HTTP
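One way to check that HTTP traffic really flows through havp is to look at squid's access log: with never_direct in force, plain HTTP requests should no longer be fetched directly. The sketch below counts any that were. It assumes squid's default native log format and the log path used by Ubuntu's squid3 package; the function name is hypothetical.

```shell
#!/bin/sh
# Assumed log location for Ubuntu's squid3 package; adjust if yours differs.
ACCESS_LOG=/var/log/squid3/access.log

# Count plain-HTTP requests that squid fetched directly rather than via
# a parent proxy. In squid's native log format, field 7 is the URL and
# field 9 is the hierarchy code plus peer (e.g. DIRECT/203.0.113.5).
# Reads the named files, or stdin if none are given.
count_direct_http() {
    awk '$7 ~ /^http:\/\// && $9 ~ /DIRECT\// { n++ } END { print n + 0 }' "$@"
}

# Example, usually needs root to read the log:
#   count_direct_http "$ACCESS_LOG"
```

Old entries from before the change will still be counted, so it makes sense to browse a few pages after reloading squid and look only at the tail of the log.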
Phase 4 - Squid Transparency and HTTP redirection
  • The above is so wonderful that all users on your network should use it. In fact, this phase will force them to do so.
  • In Phase 1, you configured your squid proxy to accept transparent requests.
  • HTTP request interception and redirection is handled by iptables commands. (A good iptables reference lies here.)
  • Most how-to articles assume the squid proxy IS the router. Not so, in my situation.
  • Most how-to articles also assume that I want ALL port 80 requests redirected to the proxy. This would preclude using the proxy computer as a web server, or as an application server with a web front-end. Unsatisfactory. Routing should be done by a router.
  • My router at address 192.168.0.1 is a Netgear DG834G. I can manually enter iptables commands if I activate debug mode by browsing to:
http://192.168.0.1/setup.cgi?todo=debug
  • Using a telnet program like PuTTY, I can 'telnet' into the router on port 23.
  • I had, in the router web interface, already allowed all HTTP requests from the proxy. However, this iptables rule will make doubly sure:
iptables -t nat -A PREROUTING -p tcp -s 192.168.0.3 --destination-port 80 -j RETURN
  • This iptables rule will intercept all TCP port 80 requests destined for beyond the local network and redirect them to the proxy, while leaving intranet requests on port 80 alone. Unfortunately, there is no way to do this through the web front end, so command-line iptables must be used.
iptables -t nat -A PREROUTING -p tcp -d ! 192.168.0.0/24 --dport 80 -j DNAT --to 192.168.0.3:3128
  • Test by setting browser to 'no proxy' and surfing to an ad-filled website.
  • One of the peculiarities of this method is that the manually entered iptables rules will disappear if the router is rebooted. That suits me fine, as it is thus easy to reverse mistakes. Even when redirection to the proxy stops, users can still reach the internet directly until the rules are re-established.
  • I could have achieved transparency by using WPAD proxy auto-configuration files. But that would mean transferring the DHCP server function from the router to the proxy computer and a different set of headaches, all the while shutting out Android devices like my shiny Samsung Galaxy S.
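Since the rules vanish on every router reboot, it may help to keep them in a small script on the Ubuntu box that prints them ready for pasting into the telnet session. The sketch below does just that with the addresses from this post; the variable and function names are my own, and it only prints the commands rather than running them.

```shell
#!/bin/sh
PROXY_IP=192.168.0.3     # the squid box
PROXY_PORT=3128          # squid's listening port
LAN=192.168.0.0/24       # the local network, left alone by the redirect

# Print the two PREROUTING rules used above, in the order they must be added.
print_redirect_rules() {
    # Let the proxy's own port 80 traffic through untouched.
    echo "iptables -t nat -A PREROUTING -p tcp -s $PROXY_IP --destination-port 80 -j RETURN"
    # Send everyone else's outbound port 80 traffic to squid.
    echo "iptables -t nat -A PREROUTING -p tcp -d ! $LAN --dport 80 -j DNAT --to $PROXY_IP:$PROXY_PORT"
}

print_redirect_rules
```

On the router itself, "iptables -t nat -L PREROUTING -n --line-numbers" should list the active rules, and "iptables -t nat -D PREROUTING N" removes rule number N without a reboot, assuming the router's iptables build supports those standard options.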
Congratulations! You have just made web browsing from your home network safer, faster, less bandwidth-intensive, and more enjoyable.

Please share or comment.

3 comments:

  1. Squidblacklist.org is the world's leading publisher of native acl blacklists tailored specifically for Squid proxy, with alternative formats for all major third party plugins and many other filtering platforms, including SquidGuard, DansGuardian, and ufDBGuard, as well as pfSense and more.

    There is room for better blacklists, we intend to fill that gap.


    It would be our pleasure to serve you.

    Signed,

    Benjamin E. Nichols
    http://www.squidblacklist.org

  2. Ads blacklist --> http://www.squidblacklist.org/downloads/squidblacklists/squid-ads.tar.gz

  3. That's mad. I ain't doing all that! That's fucked up!
