At the ISWC Privon workshop in October, Neel Guha talked about Spy Watch, his Google Chrome extension that keeps track of the third parties tracking the web pages you visit. Unlike Ghostery, it only collects information and cannot block tracking sites, but it logs more information about how your Web behavior is being observed and gives good insight into the nature and scope of the Web tracking phenomenon.
When you view a page like www.nytimes.com you expect it to know that you visited the site. It may even know personal information (e.g., name, address, age, sex) if you ever divulged it to the site, perhaps when setting up an account. Spy Watch reports that my recent visit to the NYT site was also observed by 24 other sites, including doubleclick.com, brightcove.com, googleapis.com and sothebysrealty.com. And this is with an ad blocker enabled: 28 third parties observed me when I disabled it.
Each of these third parties also knows the page on the NYT site I just visited. But I don’t have an account on most of them, so they don’t know who I really am, right? Well, some can easily discover my identity. Doubleclick, for example, knows I just read that Times article on how to cook a duck and, since it’s part of Google, can potentially integrate the information with all of the other information Google has about me.
Not all of the third party sites identified by Spy Watch are tracking us. Sothebysrealty.com, for example, showed up on my visit to the Times because it provided some content (an image) on the page. Checking my Spy Watch data shows that Sothebysrealty has seen me on just two pages (both on the NYT site) whereas Doubleclick has seen me on 1266 pages across 260 sites. Clearly Sothebysrealty is not a tracking company and Doubleclick is. Such third party tracking is done via an array of techniques that include using cookies, free analytics services, tags, web bugs, single-pixel images, JavaScript tags and web beacons.
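The pages-versus-sites comparison above can be sketched in a few lines of Python. This is only an illustration, not how Spy Watch itself works: the log format (a list of page URL and third-party domain pairs), the function names, the sample observations and the `min_sites` threshold are all assumptions made up for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse


def summarize_third_parties(observations):
    """Map each third-party domain to (distinct pages seen, distinct sites seen)."""
    pages = defaultdict(set)
    sites = defaultdict(set)
    for page_url, third_party in observations:
        pages[third_party].add(page_url)
        # The first-party "site" is taken to be the hostname of the visited page.
        sites[third_party].add(urlparse(page_url).hostname)
    return {tp: (len(pages[tp]), len(sites[tp])) for tp in pages}


def likely_trackers(summary, min_sites=3):
    """Flag third parties seen across at least `min_sites` different sites.

    The threshold is an arbitrary assumption; a real classifier would need more.
    """
    return sorted(tp for tp, (_, n_sites) in summary.items() if n_sites >= min_sites)


# Made-up observations in a hypothetical (page URL, third-party domain) format.
observations = [
    ("https://www.nytimes.com/duck", "doubleclick.com"),
    ("https://www.nytimes.com/duck", "sothebysrealty.com"),
    ("https://www.nytimes.com/home", "sothebysrealty.com"),
    ("https://news.example.org/story", "doubleclick.com"),
    ("https://shop.example.com/item", "doubleclick.com"),
]

summary = summarize_third_parties(observations)
trackers = likely_trackers(summary)
```

A third party that appears on many pages of a single site is probably just a content provider; one that appears across many unrelated sites is in a position to build a browsing profile, which is why the sketch counts sites rather than pages.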
I’ve been running Spy Watch for about two months and it reports that 1533 third party sites have (potentially) collected data about the 12,000 distinct URLs I’ve visited during this time. It also notes that, on average, every page I’ve visited has been watched by 3.7 third parties. As you might expect, the distribution follows a power law with a long tail of sites that only observed a few of my visits (about two-thirds of them saw three or fewer). Here are the top twenty third party trackers in my two months of data.
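For readers who want to reproduce this kind of summary from their own data, here is a minimal Python sketch of the two statistics mentioned above: the average number of third parties per page and the share of third parties in the long tail. The input format, the function name `tracking_stats` and the sample data are assumptions for illustration and are not taken from Spy Watch.

```python
from collections import defaultdict


def tracking_stats(observations, tail_cutoff=3):
    """Return (average third parties per page, share of third parties in the tail).

    `observations` is a hypothetical list of (page, third_party) pairs. Pages
    with no third-party observers never appear in such a log, so the average
    only covers pages that were watched at least once.
    """
    watchers = defaultdict(set)  # page -> third parties that observed it
    seen = defaultdict(set)      # third party -> pages it observed
    for page, third_party in observations:
        watchers[page].add(third_party)
        seen[third_party].add(page)
    average = sum(len(w) for w in watchers.values()) / len(watchers)
    tail = sum(1 for p in seen.values() if len(p) <= tail_cutoff) / len(seen)
    return average, tail


# Made-up data: third party "A" sees four pages, "B" one page, "C" two pages.
observations = [
    ("p1", "A"), ("p1", "B"), ("p1", "C"),
    ("p2", "A"), ("p2", "C"),
    ("p3", "A"),
    ("p4", "A"),
]

average, tail = tracking_stats(observations)
```

With `tail_cutoff=3` the second value corresponds to the "saw three or fewer" figure quoted above.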
Note that Google (red), Facebook (dark blue) and Twitter (green) are the three companies that potentially know the most about what you do on the Web.
Spy Watch can also show how many and which pages have been observed by a tracker. Facebook observed me viewing 2208 pages across 509 sites (via FB like and visit buttons) and now knows that I read reviews for Sharp and LG microwave ovens on toptenreviews.com earlier this month and frequently visit the cra.org site.
You can get and install Spy Watch from the Chrome Web Store, which describes it like this:
Spy Watch is a privacy extension that aims to create transparency in online internet tracking by third party sites. When a user visits a page, Spy Watch lets the user see every site that knows the user visited that page. And for each of these sites, the user can find out what other information the site has gathered about the user’s browsing history. After you install the extension, continue to browse normally. After some time, click on the extension to see who’s watching you! Disclaimer: User data is stored in the browser and is not accessible by the creator of this extension.