Topic: work
I had a problem at work today where someone complained that they could search Google and see all the results, but when they clicked on one they got an error message. It turned out the error was coming from the Squid proxy: they did not have access to the site. This was puzzling for two reasons. First, this person was not supposed to be filtered through the whitelist proxy. Second, why was the whitelist proxy allowing a Google search at all?
The user being on the wrong proxy was easily fixed. What I then dug into was why the whitelist was letting a Google search through.
So first the setup…
It’s pretty simple: I’m running Squid 2.6 on Debian 4 (Etch). I know I’m a few releases behind here, but hey, the system “just works” and it’s a custom install on a Cobalt RaQ 3. I’m not about to break it by trying to upgrade when it’s not needed. Anyway, Squid is running a whitelist using these lines in the config:
acl whitelist url_regex "/etc/squid/whitelist"
http_access allow all whitelist
So I’m never allowing www.google.com/*, which is what would open up searching. The whitelist does have a few references to google.com, but they are explicit entries for certain sites, like:
maps.google.com/*
khm3.google.com/*
gg.google.com/*
ssl.google-analytics.com/*
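Worth keeping in mind: Squid’s url_regex ACL treats each whitelist line as a regular expression matched anywhere in the full URL, not as a glob. That means the dots in those entries match any character, and /* means “zero or more slashes,” not “everything after the slash.” A rough sketch of how these entries behave, using Python’s re.search to approximate Squid’s unanchored matching:

```python
import re

# The whitelist entries from above, taken verbatim. As regexes, the
# unescaped "." matches ANY character and "/*" matches zero or more
# literal slashes.
whitelist = [
    r"maps.google.com/*",
    r"khm3.google.com/*",
    r"gg.google.com/*",
    r"ssl.google-analytics.com/*",
]

def whitelisted(url):
    """True if any pattern matches anywhere in the URL (unanchored)."""
    return any(re.search(pattern, url) for pattern in whitelist)

print(whitelisted("http://maps.google.com/maps?q=denver"))  # True
print(whitelisted("http://www.google.com/"))                # False
```

The unanchored matching is the important part of this story: a pattern does not have to match the hostname to let a URL through.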
Of course, I put myself on the Squid proxy and tried to go to http://www.google.com and https://www.google.com, as well as removing the www. In every case I was greeted with the “Access Denied” page. I then added Google as a search provider in IE9, searched, and lo and behold it returned a Google search page. Matt helped out by sniffing packets and watching the logs, and we could see where Squid was allowing the Google search page through. So what was going on here?
I started to take apart the string that was being used for the search:
http://www.google.com/search?q=dogs&sourceid=ie7&rls=com.microsoft:en-us:IE-Address&ie=&oe=
I cut pieces out one at a time and found that this was the minimum that got through:
http://www.google.com/search?q=pregnant+mermaids&rls=com.microsoft:en-us
I’m not sure what &rls means. I did a little searching but didn’t come up with anything except that it’s used in other places, like the default Firefox home page. We do whitelist microsoft.com, so I commented that out of the whitelist and reloaded Squid, with the same results. Matt changed microsoft to something else and that worked as well. After a bit it hit me what was happening.
We allow all .gov sites, which sometimes include .us sites, so .us/* is listed in the whitelist. I commented this out and the search was blocked. Finally, success. But two questions remained: I still need to allow those .us sites, and why was Squid recognizing the -us in en-us as .us and allowing a search through to google.com, which was not allowed?
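My best guess at the mechanism, for what it’s worth: since url_regex entries are regular expressions matched anywhere in the URL, the dot in .us/* matches any character and /* matches zero slashes, so the pattern happily matches the -us inside en-us in the query string. Approximated with Python’s re.search:

```python
import re

# The whitelist entry, taken as a regex: "." matches ANY character
# and "/*" matches zero or more slashes (zero, in this case).
pattern = r".us/*"

# The minimal search URL that got through the proxy.
url = "http://www.google.com/search?q=pregnant+mermaids&rls=com.microsoft:en-us"

match = re.search(pattern, url)  # unanchored, like Squid's url_regex
print(match.group(0))  # prints "-us"
```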
Well, I didn’t really waste time researching this. I’ve found the online documentation for Squid lacking, and I’m not sure the book they sell would have helped. I admit it could be a version issue, but again, I’m not upgrading. So my solution was to add a blacklist and explicitly block Google searches. As mentioned, I need some Google sites, so I can’t just blacklist google.com. I added www.google.com/search to a file called /etc/squid/blacklist and added these lines to squid.conf:
acl blacklist url_regex "/etc/squid/blacklist"
http_access deny blacklist
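As a sanity check (again approximating Squid’s unanchored url_regex with Python’s re.search, which is an assumption about how Squid evaluates the entry), the blacklist pattern catches the search URL while leaving the whitelisted Google hosts alone:

```python
import re

# The single blacklist entry from /etc/squid/blacklist.
blacklist = [r"www.google.com/search"]

def blacklisted(url):
    """True if any blacklist pattern matches anywhere in the URL."""
    return any(re.search(pattern, url) for pattern in blacklist)

print(blacklisted("http://www.google.com/search?q=dogs&rls=com.microsoft:en-us"))  # True
print(blacklisted("http://maps.google.com/maps"))                                  # False
```

One Squid detail worth noting: http_access rules are evaluated in order and the first match wins, so the deny line needs to sit above the allow line in squid.conf.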
After reloading Squid, I tested a search and was given the “Access Denied” page. All was right with the world again. I hope this info will help someone else dealing with a similar issue.
Updated: Friday, 10 June 2011 11:54 AM PDT