Wednesday 27 July 2011

Scroogle: Adding privacy to Google Search


Takeaway: Google Search is an amazing tool. Even so, to many, it has a dark side. Scroogle may be able to help.
Over the years, I’ve witnessed, from a safe distance, highly charged debates about search behemoths like Google. The topic most often discussed is whether or not they retain too much Personally Identifiable Information (PII) for too long. Valuable lessons surfaced from those frank discussions, many important enough for me to write about.
I have gleaned similar information from the comment sections of those articles. One example is my introduction to Scroogle.
My first impression was: What an odd name. I didn’t think much more of it. Then a colleague gave his middle-finger explanation of the term. “Oh,” was all naive me could say, “You really think so?”

Scroogle, what is it?

Now I had to find out about Scroogle. First thing that caught my eye:
“Every day Scroogle crumbles 350,000 cookies and blocks a million ads.”
Next thing I noticed, Scroogle does not:
  • Pass cookies on.
  • Keep search-term records.
  • Retain access logs for more than 48 hours.
The website calls Scroogle a scraper. Being from Minnesota, I have this image of a scraper and it is not Scroogle.
Actually, after some study, referring to it as a scraper does make sense. The pertinent search results are “scraped” from Google’s response to the search query. And only that information, with no cookies or additional requests, gets back to the client’s web browser.
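To make the scraping idea concrete, here is a minimal sketch in Python. It is not Scroogle’s code: the class and function names are hypothetical, and it assumes the results can be recognized simply as absolute links in the returned HTML. The point is only that the links are extracted and everything else, including any Set-Cookie header, is left behind.

```python
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Collect (href, text) pairs for every absolute <a href="..."> link."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_results(upstream_headers, html):
    """Return only the links; upstream headers (cookies included) are never passed on."""
    scraper = LinkScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.links


# Made-up response, standing in for what a search engine might return.
headers = [("Set-Cookie", "PREF=ID=abc123; path=/")]
page = '<p><a href="http://example.com/ice">Ice <b>scraper</b></a></p>'
print(scrape_results(headers, page))  # [('http://example.com/ice', 'Ice scraper')]
```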
The following slide depicts the steps involved (courtesy of Scroogle):

Behind the scenes

The process is simple. You enter your search request in the web browser, like normal. It is sent to Scroogle via an SSL connection (more on that later). Scroogle replaces all your identifying information with that of Scroogle. The search request is forwarded to Google. Google records the IP address and search information issued by Scroogle.
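The request side of that process might look something like the sketch below. The list of identifying headers and the replacement User-Agent value are assumptions made for illustration; the Scroogle website does not spell out exactly which fields are replaced.

```python
# Assumption: these are the headers treated as identifying. The real list is not published here.
IDENTIFYING_HEADERS = {"cookie", "referer", "user-agent", "x-forwarded-for"}


def build_upstream_request(client_headers, query):
    """Drop identifying headers and describe the request the proxy would send on."""
    kept = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }
    kept["User-Agent"] = "proxy/1.0"  # hypothetical value supplied by the proxy itself
    return {"headers": kept, "params": {"q": query}}


request = build_upstream_request(
    {"Cookie": "PREF=ID=abc123", "User-Agent": "MyBrowser/5.0", "Accept": "text/html"},
    "ice scraper",
)
print(request["headers"])  # {'Accept': 'text/html', 'User-Agent': 'proxy/1.0'}
```

Because the proxy opens its own connection to the search engine, the IP address seen upstream is the proxy’s, not the user’s.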
Google then replies with a cookie and the search results. Scroogle sanitizes the data, sending only the search results back to you. Below are the search results for ice scraper using Google:
Next are the results using Scroogle:

Scroogle, the plugin

The website calls Scroogle a browser plugin. It is simple enough to set up, but I’d like to expand on the minimal help offered by the website:
  • Firefox: This link is to the Firefox add-on. All that is required is to click on the Add-on button.
  • Internet Explorer: Microsoft set up Internet Explorer to ask for the desired search engine. Details are at this link. All that is required is to enter http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw=TEST where it asks.
  • Opera: Click on the following: Tools/Preferences/Search/Add. Pick a new keyword “example” and use http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw=%s as the address.
  • Chrome: Click on Wrench/Options/Default Search Manage/Add. Then paste https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi?Gw=%s where a URL is requested.
If you prefer not to alter the current configuration of your web browser, or are using a computer other than your own, Scroogle has a webpage similar to Google, where you can enter search terms.
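The %s in the Opera and Chrome addresses is a placeholder that the browser swaps for your search terms. The Python sketch below imitates that substitution; the function name is hypothetical, and a browser’s exact encoding may differ slightly (some encode a space as %20 rather than +).

```python
from urllib.parse import quote_plus

# Address taken from the Chrome instructions above.
TEMPLATE = "https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi?Gw=%s"


def search_url(template, terms):
    """Replace the %s placeholder with URL-encoded search terms."""
    return template.replace("%s", quote_plus(terms))


print(search_url(TEMPLATE, "ice scraper"))
# https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi?Gw=ice+scraper
```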

Back to SSL

The Scroogle website points out why the creators decided to use SSL connections:
“For Scroogle, SSL is used to hide your search terms from anyone who might be monitoring traffic between your browser and Scroogle’s servers. This encryption happens when you send your search terms to Scroogle, and it also happens when Scroogle sends the results of your search back to you.”
The SSL webpage points out another advantage that I was not aware of:
“When the Scroogle results come back from an SSL search, and you click on any of the links shown on that secure page, there is another advantage. SSL does not allow the browser to record the address where that secure page came from and attaches it to any outgoing non-SSL links on that page. Normally all browsers do this and it’s called the “referrer” address.
Using SSL blanks out this referrer, so that any non-SSL site you click on from a Scroogle SSL page won’t know that you arrived at their site from Scroogle. The referrer will be blank, and your log entry at that site will look like any of the hundreds of bots that crawl the web all day and night with similar blank referrers.”
I did not know that until now.
That said, do not let the use of SSL connections lure you into a false sense of security. SSL may or may not be in play after you click on one of the returned search links. It depends on whether the web server advertised in the link is using SSL or not.
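The referrer rule in the quote can be written down in a few lines of Python. This is a sketch of the classic behavior only (no referrer when leaving a secure page for a non-secure one); the function name is hypothetical, and modern browsers layer additional referrer policies on top of it.

```python
from urllib.parse import urlsplit


def referrer_sent(from_url, to_url):
    """Return the referrer a browser would send under the classic rule, or None."""
    from_secure = urlsplit(from_url).scheme == "https"
    to_secure = urlsplit(to_url).scheme == "https"
    if from_secure and not to_secure:
        return None      # secure page -> non-secure link: referrer is blanked
    return from_url      # otherwise the originating address is passed along


results_page = "https://ssl.scroogle.org/cgi-bin/nbbwssl.cgi"
print(referrer_sent(results_page, "http://example.com/"))   # None
print(referrer_sent(results_page, "https://example.com/"))  # the results page address
```

Note the second case: the blanking described in the quote applies to non-SSL links, so under this rule an SSL destination can still see where you came from.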

Both use SSL

Google also has the option to use SSL. Google makes the same claim: encryption prevents third parties from intercepting transmissions between the user’s computer and Google Search web servers.
My immediate thought: It would be cool if the Scroogle servers used SSL when talking to Google Search. I shot off an email to Scroogle and Daniel Brandt, Founder and President of Scroogle, offered this:
“No, the connection between my servers and Google does not use SSL.
There are two reasons for this:
  • The search terms for that hop are carried by the IP address of my server, and the only way they can be associated with the searcher’s IP address would be if someone hacked into my dedicated servers and read my logs. And they’d have to be quick about it, because I don’t keep any logs longer than 48 hours. I’m the only one with access to my servers.
  • I do not use DNS to do a lookup of www.google.com. Instead, I randomly select one of their static IP addresses for www.google.com (they have thousands). As you may know, https initiation requires a handshake that certifies that the domain name belongs to the IP address. Since I’m not using “www.google.com” at all, I cannot initiate an https session with Google.”
That makes sense to me. Thank you for clearing that up, Daniel.
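Daniel’s second point can be illustrated with a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: the address pool uses documentation-range addresses rather than real ones, the function names are hypothetical, and the name check is an exact match only (real TLS libraries also handle wildcards and address entries in certificates). It assumes the client asks for the server by bare IP address, as described in the quote.

```python
import random

# Hypothetical pool; documentation-range addresses stand in for real server addresses.
ADDRESS_POOL = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def pick_address(pool, rng=random):
    """Pick one address at random instead of doing a DNS lookup."""
    return rng.choice(pool)


def name_matches_certificate(requested_name, certificate_names):
    """Simplified check: does the name the client asked for appear in the certificate?"""
    wanted = requested_name.lower()
    return any(wanted == name.lower() for name in certificate_names)


cert_names = ["www.google.com"]   # names the certificate is assumed to vouch for
target = pick_address(ADDRESS_POOL)

print(name_matches_certificate("www.google.com", cert_names))  # True
print(name_matches_certificate(target, cert_names))            # False
```

A request made by domain name passes the check; a request made by a bare address has no name for the certificate to confirm.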

Quality of SSL connection

I just happen to be researching a new Comodo website, SSL Analyzer. It is a free web-based scanning tool that checks the security of a web server providing SSL connections.
Included in the summary is information about the certificate and digital signature. Also included is a list of security protocols and encryption suites supported by the web server.
SSL Analyzer uses the following designations to highlight problems:
  • Red: Problem that needs immediate attention.
  • Amber: Potential issue that needs evaluation.
With so much emphasis being placed on SSL connections, I thought, why not test them? Here are the results for Scroogle and the results for Google Search. You can see that both have issues. I am not sure I would consider them show-stoppers, but it is something to think about.

Bottom line

Now comes the hard part. After all is said and done, it ends up being a matter of trust. If using Google Search is important, but you are not sure about trusting Google, you may want to think about Scroogle.