Joris van Hoboken: Does privacy still exist in an environment of search?

“In a society of the query, it’s an interesting question to ask what happens to all those queries, what legal norms apply to the registration, processing and access to these queries, and do these norms successfully safeguard the more fundamental interests of search engine users: a free realm to seek and access information and ideas,” began Joris van Hoboken at The Society of the Query conference this afternoon.

Hoboken is a PhD candidate at the Institute for Information Law at the University of Amsterdam, writing his dissertation about search engines and freedom of expression. His research investigates the impact of legal norms on users’ freedom, and today at The Society of the Query he focused on the question, “Does privacy still exist in an environment of search?”

Accessing Our Data From Search Engines
Hoboken rightly pointed out that most users know little about data protection. The corporations behind popular search engines like Google, Yahoo and AOL store a plethora of user data (query logs, IP addresses, timestamps, cookies, etc.), and what most people don’t know is that EU law grants users “the right to access any personal data stored about them.”

For example, Article 12 of European Union Directive 95/46/EC reads, “Member States shall guarantee every data subject the right to obtain from the controller: […] knowledge of the logic involved in any automatic processing of data concerning him. […] When applied specifically to search engines, users must have the right to access any personal data stored about them.” This brings Hoboken to the question: why are we so passive in enforcing these rights?

Exercising Our Rights To Our Data
In 2006 AOL deliberately released 20 million partially anonymized search queries. Hoboken reminded us how seemingly innocuous data can be pieced together and traced back to our identities, as was the case with AOL user #4417749, who, based on the content of her web queries, was later identified as 62-year-old widow Thelma Arnold. Online you exist as a number, but data is never completely anonymized. Hoboken laments, “AOL thought it would be good for researchers, and it’s a bit unfortunate that the backlash from this experiment means that it is now much harder for the public to get a hold of search data. Search data that is important for scholars to do research.”

Although there are opportunities for accessing this information, the application of the law sometimes falls short. Hoboken lists three problems that we run into when attempting to access our data from search engines: “These companies are opaque, divorced from reality, and they advocate data storage with reference to repressive purposes.” In his presentation he points to examples of these problems echoed in Google’s retention policy and in an NPR interview with Google CEO Eric Schmidt (see slides, coming soon).

“We really have to worry about the extended amount of data being stored,” says Hoboken, “but fortunately there are many laws already established to protect us and our data.” He challenges us to take advantage of these laws and to ask more questions. And while it may not be possible to fully anonymize the data being collected, the fate of online privacy lies in our understanding of these laws and in our ability to exercise the rights that will protect our data from being (ab)used.
