
Squid site restrictions

Posted by defsdoor on Mon 22 May 2006 at 15:46

In the office I needed a way to block some websites permanently and others outside of break times. After looking at some inline solutions I realised that I could easily do what was needed with squid alone. Here's how.

I created the following ACLs in squid's config file:

acl blockedsites url_regex -i "/etc/squid/blocked.txt"
acl bannedsites url_regex -i "/etc/squid/banned.txt"

acl lunchtime time MTWHF 12:15-13:45

Then I can apply these ACLs near the end of my squid ACL rules:

http_access allow managers
http_access deny blockedsites !lunchtime
http_access deny bannedsites
http_access allow domainusers
http_access deny all

I use squid authentication here - the managers ACL refers to special users who have no restrictions. Putting this line before the restrictive rules means it is applied and matched first. The domainusers ACL refers to any authorized users - unauthorized users are denied all access.
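For reference, managers and domainusers are ordinary proxy_auth ACLs. A minimal sketch, assuming basic authentication via the NCSA helper - the helper path, realm and user names below are placeholders, not my actual setup:

# Basic auth via the NCSA helper - paths and names are examples only
auth_param basic program /usr/lib/squid/ncsa_auth /etc/squid/passwd
auth_param basic realm Office proxy

# "managers" matches the named unrestricted users,
# "domainusers" matches anyone who authenticates successfully
acl managers proxy_auth alice bob
acl domainusers proxy_auth REQUIRED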

So, you can see that access is denied for both ACLs, and the blockedsites rule carries the extra condition !lunchtime. That means "deny access while it is not lunchtime" - ACLs listed on the same http_access line are logically ANDed.

The entries in the /etc/squid/blocked.txt and /etc/squid/banned.txt files are simple:

ebay
planetfootball.com
bigbrother.channel4.com

These are url_regex entries, and because I keep them simple like this, the occurrence of, say, "ebay" anywhere in a URL will match and the request will be denied.

When a new entry is added to either of the files it's a simple matter of "/etc/init.d/squid reload" to force squid to see the changes.
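If you prefer not to go via the init script, "squid -k reconfigure" does the same job. A quick way to check that a new entry really is denied is squidclient - the host and port below are only the usual defaults, adjust for your setup:

# re-read squid.conf and the ACL files without a full restart
squid -k reconfigure

# request a blocked URL through the proxy - expect a denied/403 page back
squidclient -h localhost -p 3128 http://www.ebay.com/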

Re: Squid site restrictions
Posted by Anonymous (59.37.xx.xx) on Tue 23 May 2006 at 04:47
What entries should I put in the /etc/squid/blocked.txt and /etc/squid/banned.txt files if I want to allow access to www.google.com but restrict access to news.google.com?

[ Parent ]

Re: Squid site restrictions
Posted by defsdoor (83.105.xx.xx) on Tue 23 May 2006 at 08:51
Just add news.google.com to the blocked or banned lists. Anything that contains news.google.com anywhere in the requested URL will be denied.
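One thing to watch: because these are url_regex substring matches, that entry also denies any URL that merely contains the string - a Google search for news.google.com, for instance. If that matters, an anchored entry (a sketch only, untested) limits the match to the start of the URL:

# plain substring entry - matches news.google.com anywhere in the URL
news.google.com

# anchored alternative - only matches when it appears as the host
^http://news\.google\.com/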

[ Parent ]

Re: Squid site restrictions
Posted by Anonymous (59.95.xx.xx) on Wed 7 Jun 2006 at 06:29
What entries should be made, and in which of the two files, to restrict www.google.com? Or should it go in both files?

[ Parent ]

Squid site restrictions BASIC
Posted by deviant2 (110.138.xx.xx) on Sun 29 May 2011 at 19:04
## (1) Is this configuration written correctly? (assume each pattern goes on its own line in the txt file)
#
# I want to block ALL SUBDOMAINS of google.com, like:
acl http_access_block__me ^http://.*\.google\.com/.
http_access deny http_access_block__me
# BUT allow "news.google.com":
acl http_access_allow__me ^http://.*\.news\.google\.com/. ^http://news\.google\.com/. ^http://.news\.google\.com/.
http_access allow http_access_allow__me
#
# Or should I swap them (allow first, then block)?

##
## (2)(a) I want to block ANY sub/main domain which contains "ad",
# like "ad.example.com"
# (b) but NOT in the URL path, like "example.com/ad.php" or "example.com/add/?user=99999" or "whoisdomain/?q=www.ad.com"
# (c) BUT allow "download.com" or "download.example.com/?q=file.zip" or "download.example.com/ad.php"
#
# If I use ".ad." or ".ad\." then (b) and (c) get restricted too;
# if I use ".\.ad\." then (b) gets restricted.
# I also don't understand (1): which rule applies first, and how do the 2nd, 3rd, etc. ACLs get their say?
#
# (d) And one more:
# I want to block all ports on "subA.example.com" (like http) but allow ftp.

# So the big question is: how do I add exceptions in squid?
# Do you know what I should do?

[ Parent ]

Re: Squid site restrictions
Posted by sphaero (62.177.xx.xx) on Tue 23 May 2006 at 07:16
Another way to accomplish this is by using squidguard, dansguardian or squirm. Does anyone know how this solution performs compared to, for example, squidguard or squirm? I reckon this solution is not recommended when using a large blocklist?

[ Parent ]

Re: Squid site restrictions
Posted by defsdoor (83.105.xx.xx) on Tue 23 May 2006 at 09:05
That purely depends on how each solution manages the lists. Ultimately each has to compare the requested URL against every listed regex, though, so I would guess there's not going to be much between them.

As these are regex entries you can speed things up, but I felt that showing that would be too complicated for a simple guide like this.

For example -

^http://www\.google\.com 
would block URLs beginning with www.google.com - note that the http:// is needed. This is far quicker than a pure substring search because the comparison fails sooner. Also note that in this example I have correctly escaped the '.'s, as they ordinarily mean 'any character'.
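A couple of anchored entries for blocked.txt along those lines - a sketch only, the domains are just examples:

# www.ebay.com, search.ebay.co.uk, etc. - any ebay.com / ebay.co.uk host
^http://([^/]*\.)?ebay\.(com|co\.uk)/

# news.google.com but not www.google.com
^http://news\.google\.com/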

I intend eventually to put a web interface in front of these lists; it will generate the regexes so the users will not have to worry about them.

The following script detects changes to the lists and reloads squid automatically - I run this on a cron job.

#!/bin/ksh
#
# Reload squid when any of the list files has changed since the last reload.

# list files to watch (all live in /etc/squid)
FILELIST="blocked.txt banned.txt noauth.txt"

# timestamp file, touched each time we reload
REFFILE=/etc/squid/.reload

RESTART=N

for FN in $FILELIST
do
        # -nt: true if the list file is newer than the reference file
        [[ /etc/squid/$FN -nt $REFFILE ]] && RESTART=Y
done

if [[ $RESTART = "Y" ]]
then
        touch $REFFILE
        /etc/init.d/squid reload
fi

This also monitors an additional file called "noauth.txt", which I use to list sites that may be accessed before authentication - such as anti-virus update sites or Windows Update.
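To run it I just drop a line into root's crontab (crontab -e) - the script path and the five-minute interval below are only an example:

# check the list files every five minutes
*/5 * * * * /usr/local/sbin/check-squid-lists >/dev/null 2>&1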

[ Parent ]

Re: Squid site restrictions
Posted by Anonymous (196.36.xx.xx) on Tue 23 May 2006 at 07:49
Hi

Great article, so I thought I'd add a few ACLs I use and have come across.

To block certain MIME types etc.:
===============================================================
## Mime Blocking ## BLOCKING requested mime types
acl mimeblockq req_mime_type -i ^application/x-icq$
acl mimeblockq req_mime_type -i ^application/x-comet-log$
acl mimeblockq req_mime_type -i ^application/x-pncmd$
acl mimeblockq req_mime_type -i ^application/x-hotbar-xip20$
acl mimeblockq req_mime_type -i ^.AIM.
acl mimeblockq req_mime_type -i ^application/octet-stream$
acl mimeblockq req_mime_type -i application/octet-stream
acl mimeblockq req_mime_type -i ^application/x-mplayer2$
acl mimeblockq req_mime_type -i application/x-mplayer2
acl mimeblockq req_mime_type -i ^application/x-oleobject$
acl mimeblockq req_mime_type -i application/x-oleobject
acl mimeblockq req_mime_type -i application/x-pncmd
acl mimeblockq req_mime_type -i ^video/x-ms-asf$

acl mimeblockp rep_mime_type -i ^application/x-mplayer2$
acl mimeblockp rep_mime_type -i application/x-mplayer2
acl mimeblockp rep_mime_type -i ^application/x-oleobject$
acl mimeblockp rep_mime_type -i application/x-oleobject
acl mimeblockp rep_mime_type -i application/x-pncmd
acl mimeblockp rep_mime_type -i ^video/x-ms-asf$
acl mimeblockp rep_mime_type -i ^application/x-icq$
acl mimeblockp rep_mime_type -i ^.AIM.
acl mimeblockp rep_mime_type -i ^.*AIM/HTTP
acl mimeblockp rep_mime_type -i ^application/x-comet-log$
acl mimeblockp rep_mime_type -i ^application/x-pncmd$
acl mimeblockp rep_mime_type -i ^application/x-chaincast$
acl mimeblockp rep_mime_type -i ^application/x-hotbar-xip20$

http_access deny mimeblockq
http_reply_access deny mimeblockp

===============================================================

## Stop multimedia downloads - hence audio streaming.
acl useragent browser -i ^.NSPlayer.
acl useragent browser -i ^.player.
acl useragent browser -i ^.Windows-Media-Player.
acl useragentq rep_mime_type -i ^.video.
acl useragentq rep_mime_type -i ^.audio.
http_access deny useragent
# rep_mime_type matches the reply, so it has to be checked with http_reply_access
http_reply_access deny useragentq

HTH

Brent Clark

[ Parent ]

Re: Squid site restrictions
Posted by Anonymous (81.3.xx.xx) on Tue 23 May 2006 at 17:43
Have a look at an open source project called CensorNet, which deploys Squid and DansGuardian. You can find more information about it at http://www.censornet.com/

They're currently working on v4 although I don't have any details on this version. There might be more information on their forums about the changes and new features etc.

[ Parent ]

Re: Squid site restrictions
Posted by daemon (198.54.xx.xx) on Tue 23 May 2006 at 23:39
Wondering why you didn't use a dstdomain ACL instead of a regex-based one? Sure, you lose the (slight, in this case) benefit of matching patterns rather than static strings, but the processing overhead has got to be better for dstdomains than regexes...

Just wondering really...

Cheers.

[ Parent ]

Re: Squid site restrictions
Posted by Anonymous (83.105.xx.xx) on Wed 24 May 2006 at 08:33
I did - but there are so many sites now that mangle the hostname. I think dstdomain is a complete match - not sub-domains - so blocking ebay, for example, would require dozens of entries.

[ Parent ]

Re: Squid site restrictions
Posted by daemon (146.231.xx.xx) on Wed 24 May 2006 at 15:17
As long as you can be sure of the LCDFQDN (let's see, "Lowest Common Denominator Fully Qualified Domain Name", phew ;-), then you can use dstdomain, but you have to remember that the full-stop (or "period" to those on the wrong^H^H^H^H^Hother side of the pond) is significant. If you want to match "ebay.com", you can enter "ebay.com" as the domain name. However, if you want to match all sub-domains of ebay, you can use ".ebay.com", which basically means "match *.ebay.com". So it does match sub-domains.

If you want even more power, you could look at using the "dstdom_regex" which does more sophisticated pattern matching, with the expected extra overhead.

The important point, though, is that these ACLs only match against the domain-name portion of a URL, not the whole URL, which can save a lot of processing for sites like ebay that often have lengthy URLs with many quoted query arguments.

Just a reminder to all readers, to save a "huh, why's it not working?" moment like I had the other day: when you specify a file that squid should check, you need to put it in quotes ("), otherwise it'll use the filename as a literal string to match. D'oh! There's no way my users would ever have browsed to /etc/squid/domains.deny, but they could still get to all the "banned" sites...
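Putting both points together, a dstdomain ACL fed from a file might look like this (the ACL name and the file contents are just for illustration):

# note the quotes around the file name
acl banneddomains dstdomain "/etc/squid/domains.deny"
http_access deny banneddomains

The file then contains one domain per line - a leading dot matches all sub-domains as well:

.ebay.com
news.google.com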

Cheers.

[ Parent ]

Re: Squid site restrictions
Posted by Anonymous (195.76.xx.xx) on Thu 3 Aug 2006 at 18:45
If you only match against the domain-name portion of a URL, clever users can bypass your squid filter by using an allowed domain as a proxy (see http://www.oreillynet.com/pub/h/4807 for an example of how to use Google for this purpose).

[ Parent ]

Re: Squid site restrictions
Posted by Anonymous (203.81.xx.xx) on Thu 25 May 2006 at 06:51
Excellent work - I hope to see more squid ACL examples from you in the future.

[ Parent ]