conference reports

Deep Search II: Panel 4, Contextual Modeling and Closing Discussion

Posted: July 11, 2010 at 5:04 pm  |  By: Shirley Niemans

This entry is part 4 of 4 in the series Deep Search II

Panel 4: Contextual Modeling

An unstorable and unmanageable amount of data is coming at us, bringing with it a host of new strategies, such as visualization models, for grasping and analyzing this flood of bits and bytes.

mc schraefel: Beyond Keyword Search
Dr schraefel is a reader in the Intelligence, Agents and Multimedia Group at the University of Southampton, UK.

Schraefel first emphasizes that, contrary to what people may assume of a visualization expert, she is not ‘in love with graphs’; actually, most of the time, big fat graphs suck. The research she presents here deals with the circumstances of serendipity. Following the idea that ‘fate favors the prepared mind’, she argues that discoveries never happen by chance, and that an important challenge lies in designing tools that support serendipitous discovery.

She then presents the audience with a 1987 video by Apple Computer, which introduces the ‘Knowledge Navigator’: a tablet-like personal device with a natural language interface, a virtual ‘digital assistant’ and access to a global network of information. Outdated as the device may seem today, the digital assistant seemed able to create graphs by getting data out of its embodied context (such as other people’s documents), data that could be mined and combined to answer a variety of questions. In 1987, schraefel comments, this was a vision of exploration, heterogeneous sources, representation and integration that still inspires research into knowledge building today.

Schraefel notes how Google is the current search paradigm – “what else do you need?”. Drawing a parallel, she notes how Newton’s Principia Mathematica set the tone for seeing the world for ages, until it turned out that in some spaces the model was flawed. It is much the same with Google’s document-centric, single-source search without interrelations – the model frames the questions that may be asked. In order to enable knowledge gathering, we need a different one.

In a 2001 Scientific American article, Tim Berners-Lee, Ora Lassila and Jim Hendler introduced machine-readable mark-up and the Semantic Web as a new paradigm that moved away from keyword search and toward structured data and ontologies. Ontologies in this sense are built from subject-predicate-object statements, such as ‘a composer is a person’ or ‘a person has a name’, etcetera. By giving data a rich (and often multiple) metadata context and using some logic, one may infer properties of objects that are not explicitly labeled, and enable knowledge gathering from heterogeneous sources.
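To make this concrete, here is a minimal toy sketch (my illustration, not from the talk; the entities and rules are invented, and real Semantic Web systems use RDF, OWL and dedicated reasoners rather than Python sets) of how subject-predicate-object statements plus a simple ‘is a’ rule let a system infer properties that were never labeled explicitly:

```python
# Minimal triple store: subject-predicate-object statements plus one inference
# rule ("is a" is followed upward through a small class hierarchy).
triples = {
    ("Beethoven", "is a", "composer"),
    ("composer", "is a", "person"),
    ("person", "has a", "name"),
    ("Beethoven", "composed", "Symphony No. 9"),
}

def types_of(entity, facts):
    """Follow 'is a' links upward to collect all classes an entity belongs to."""
    found, frontier = set(), {entity}
    while frontier:
        nxt = {o for (s, p, o) in facts if p == "is a" and s in frontier}
        frontier = nxt - found
        found |= nxt
    return found

def inferred_properties(entity, facts):
    """Properties attached to any class the entity belongs to, even if the
    entity itself was never labeled with them."""
    classes = types_of(entity, facts)
    return {(p, o) for (s, p, o) in facts if s in classes and p != "is a"}

print(types_of("Beethoven", triples))            # {'composer', 'person'}
print(inferred_properties("Beethoven", triples)) # {('has a', 'name')}
```

In the talk’s example, stating that a composer is a person is enough for the system to infer that Beethoven, labeled only as a composer, has a name.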

Does this imply a reprise of Victorian taxonomies? Nope; quoting schraefel, “it is more pomo than that”: objects are described from multiple contexts. There is no über-ontology, and we are slowly learning to be ‘ok’ with the fact that we don’t know everything controllably, and to be messy. Following Berners-Lee, she emphasizes the importance of liberating our data: placing sources freely on the web so that we may ask questions other than the document kind, and create information rather than merely retrieve it.


Deep Search II: Panel 3, Rent and Bias

Posted: June 26, 2010 at 4:03 pm  |  By: Shirley Niemans

This entry is part 3 of 4 in the series Deep Search II

Panel 3: Rent and Bias

After dwelling in the eighteenth, nineteenth and twentieth centuries in the morning panels, Felix Stalder comments that the program’s strict chronological order will now lead us into the twenty-first century. Keeping the metaphor of the map and the mapmaker alive, the next two speakers will talk about the politics and interests involved in processes of ranking, mapping and creating order in search results. Two such politics are ‘bias’ – why does a certain ranking exist – and ‘rent’ – how are all these practices transformed into a business.

Elizabeth von Couvering: Economic Bias in Search Results
Elizabeth von Couvering is a recent PhD graduate of the London School of Economics.

Contrary to the earlier presentations, Von Couvering’s talk shifts from what search engines should be to what they are today. Her major concern is the responsibility information vehicles have to the public interest. Bias gets embedded in search results in a number of ways: first of all, search engines do not index the whole Web. Secondly, they do not index reliably. Furthermore, some engines systematically favor certain sites, and the local advertising market has also proven to play a major role in the quality of the indexing process and the subsequent size of the index: if you don’t have enough to offer, you will get a reduced quality of service. Search engines are a matter of public interest since they help people find things they don’t know about, and people are unsophisticated in their queries; they tend not to look beyond the first page of results and tend to trust the rankings. Bias, then, has major implications.

Many early engines have merged over time. From 1996 on, media companies bought up search engines as they proved to attract large audiences. The ‘integrated portals’ that emerged were selling an audience to advertisers; the classic media model of production, packaging and distribution. Many search engines died under this audience-based model, as the engine itself was often no longer developed. Currently we have moved toward paid performance advertising, pay-per-click, a traffic-based value chain. Google is no longer looking at an audience but at the movement of users from one site to another. Search engines have become online media giants with an incredible market share, and ‘gaming the system’ has become a profitable professional activity.

What has been done to address the problem of bias? Von Couvering points towards search engine efforts to improve search quality by focusing on relevance and customer satisfaction. What constitutes a relevant result is based on a customer’s frame of mind. In terms of the technology, relevance is taken as an objective indicator of search engine retrieval quality. Relevance – not fairness, diversity, objectivity or formative value, for instance. Defining quality as relevance is problematic: you can’t succeed in working toward a less biased search engine unless you get beyond the idea of relevance and introduce an alternative mode of framing search results.

Von Couvering argues that there is a need for a discussion of professional codes of ethics for information scientists. Engineering goals are primarily described in terms of efficiency, or sometimes ‘elegance’. She feels that there is room for standards such as exist in library science, for instance, where access for everybody is central, or in journalism, where seeing both sides of a story is a central element of professional development. There is a need for public debate on an Internet that is other than a marketplace or a retail store, which she found was a recurring theme in her research. She concludes: “This is not information retrieval, this is sales.”


Deep Search II: Panel 2, Sociometry, Networks and Classification

Posted: June 11, 2010 at 10:26 pm  |  By: Shirley Niemans

This entry is part 2 of 4 in the series Deep Search II

Panel 2: Sociometry, Networks and Classification

In a brief introduction, Konrad Becker mentions that the first speaker of this panel, Greg Elmer, is unable to make it to the conference. He is happy to welcome Sebastian Giessmann, as his focus on networking as a ‘technique of the social’ works well within the context of the historical vision of organizing the world’s information; not only things but also people. We can see this development in the rise of mass society, when the need to classify social relations emerged. Interestingly, the roots of these social classification systems are often murky and obscure; criminology, for instance, has a background in rather non-scientific and occultist themes. Sociometry finds its origin in a slightly more progressive idea.

Sebastian Giessmann: From Sociometry to Social Networks. Networking as Technique of the Social. Sebastian Giessmann is a research fellow at the Excellence Cluster TOPOI in the cross-sectional group “Cultural Theory and its Genealogies” of the Humboldt-Universität zu Berlin.

As a cultural historian, Giessmann is professionally passionate about the occult. He specializes in the history of networks and networking, and intends to recount here the story of the ‘wild love affair’ between sociology and the visual form of the network diagram. His argument will frame both the history of knowledge and what William Mitchell has called ‘diagrammatology’.

Net diagrams are systemic pictures, constantly reaching the boundaries of inscription spaces. They become network diagrams only if their nodes represent heterogeneous entities. Once a net consists of hybrid agents, interconnectivity and heterarchy become the standard, instead of ‘mere’ connectivity. But it is near impossible to draw the extendibility, aggregation and dissolution of networks in an iconic form – the classic conflict between time and space in media theory, and the reason why social images often resort to the dynamics of animation and simulation techniques. Grappling with the (im)possibilities of the topological, relational, visual form that has come to represent the network society is something Giessmann feels we must address critically and historically. Network diagrams offer only a measurement of sociality; out of the micro dimension of groups emerges the macro dimension of the network society.

Giessmann moves on to talk about the classics of sociology. Comte, Weber, Durkheim and Simmel rarely used graphics, but the 1930s introduced new methods with Jacob Levy Moreno’s psychological geography and Otto Neurath’s visual statistics, both as a way to deal with huge numerical datasets and to appeal to a wider audience. Moreno’s graphics imported the image practices of chemistry, making the ‘social atom’ the basic element for sociological visual augmentation.


Deep Search II: Panel 1, Visions of Organizing the World

Posted: June 11, 2010 at 3:09 pm  |  By: Shirley Niemans

This entry is part 1 of 4 in the series Deep Search II

The second edition of the World Information Institute’s Deep Search conference series took place in Vienna on May 28. Where the first Deep Search symposium, held in November 2008 (find a review here), dealt with the history of information retrieval, the automatic classification of data, civil liberties, digital human rights, the power embedded in search systems and the visibility of online content, this second edition promised to look more deeply into both the history and future of classifying information and large datasets.

Panel 1: Visions of Organizing the World

Introducing the first panel, Felix Stalder notes how the ‘grand title’ of the panel emphasizes an important issue; the urge to organize the world’s information is as old as human culture. Themes reemerge – organization cannot exist without an operating model and an array of judgments as to what constitutes information and knowledge. An historical perspective is important in this respect, as seemingly new issues are seldom unprecedented.

Chad Wellmon: Google Before Google, or, On the History of Search.

First speaker Chad Wellmon is Assistant Professor of Germanic Languages and Literature at the University of Virginia.

Wellmon starts off by quoting a New York Times article in which a Media Studies professor claims that Facebook’s unwillingness to let Google crawl part of its content threatens the open and democratic arrangement of information on the Web. To such advocates the hyperlink is no more than a ballot, an embodiment of freedom. To the individual user however the Web in its fullness does not exist. Active linking confers a structural integrity to one document, and not to another. The hyperlink method of organization may be said to be less hierarchical than categorization, but to say that the Web is democratic in nature is to ignore the means by which we access it. Search technology and linking make the Web seem smaller and more manageable than it is, and highlight its fundamentally contingent nature.

In order to gain a historical perspective on all this, Wellmon traces the history of search technology, “a story of constraint and expansion”, back to what he feels is the prototype of the Web’s hyperlink: the eighteenth-century footnote. The Enlightenment project is a complex of footnotes and citations, one pointing to the next. Reflexivity is in the footnote. Books ‘talked to each other’ in a constant citing process in which the relevance of one text was decided by footnotes pointing toward other texts. Reading the Enlightenment as a series of technologies to manage the intense proliferation of information, however, invites the question: what kind of knowledge is deduced from this citational logic?

Using a recent computer visualization of the citation process within an eighteenth-century encyclopedia, Wellmon shows the emergence of multiple subsystems within the encyclopedia, exposing the double character of search technology: citing leads to inner circling; it establishes an inside and an outside, inclusion by means of exclusion. This double logic, Wellmon suggests, may well produce the distinction between information and knowledge.


Lecture David Gugerli – The Culture of the Search Society

Posted: November 30, 2009 at 5:28 pm  |  By: margreet

Data Management as a signifying practice
David Gugerli, ETH Zurich
November 13, 2009, Amsterdam

Edited by: Baruch Gottlieb

Databases are operationally essential to the search society. Since the 1960s, they have been developed, installed, and maintained by software engineers in view of a particular future user, and they have been applied and adapted by different user communities for the production of their own futures. Database systems, which since their inception have offered powerful means for shaping and managing society, have developed into the primary resource for search-centered signifying practice. The paper will present insights into the genesis of a society which depends on the possibility to search, find, (re-)arrange and (re-)interpret vast amounts of data.

Download the full lecture by David Gugerli, given during the Society of the Query conference on Friday, November 13, 2009, here.

Siva Vaidhyanathan on Googlization, “Only the elite and proficient get to opt out”

Posted: November 19, 2009 at 7:13 am  |  By: Chris Castiglione

The term Googlization, according to Siva Vaidhyanathan, is the process of being processed, rendered, and represented by Google.

Vaidhyanathan’s upcoming book The Googlization of Everything investigates the actions and intentions behind the Google corporation. This afternoon at the Society of the Query, Vaidhyanathan chose one issue from his book: the politics and implications of Google Maps’ Street View.

According to EU law, there cannot be any identifiable information about a person in Google Street View. Google’s standard defense up till now has been that they respect privacy by scrambling faces and license plates, to which Vaidhyanathan commented,

In my former neighborhood in New York there were probably 50 illegal gambling institutions around. Now, imagine an image of me on Google Street View taken in proximity to one of these illegal places. I’m more than two meters tall and I’m a very heavy man. You could blur my face forever, I’m still bald. In New York, usually I was walking around my neighborhood with a white dog with brown spots, everyone in the neighborhood knew that dog. So you could blur my face and it still wouldn’t matter – it’s me, I’m obviously me. Anonymization isn’t an effective measure, as we’ve already found out with data. (most likely referring to the AOL case of user #4417749)

Just this morning Swiss authorities made a statement that they plan on bringing a lawsuit against Google in the Federal Administrative Tribunal because Google isn’t meeting the country’s demands for tighter privacy protection with Google Street View. Vaidhyanathan commenting on the news said, “Google Street View has been entering so many areas of friction and resistance – this brings it to our attention that the game is over for Google.”

Vaidhyanathan’s criticism of Google Street View continued with Google’s trouble in Tokyo. “The strongest reaction against Google Street View has been in Japan,” he said, “Google will scrap all of their data from Japan and re-shoot the entire country. Google mismeasured how the Japanese deal with public space. In the older sections of Tokyo the street in front of one’s house is considered the person’s responsibility, it is seen as an extension of their house. Thus, Google Street View is actually invading someone’s private space.”

Earlier this year Google CEO Eric Schmidt made the following remark about the international appeal of Google,

The most common question I get about Google is ‘how is it different everywhere else?’ and I am sorry to tell you that it’s not. People still care about Britney Spears in these other countries. It’s really very disturbing.

Vaidhyanathan explained this as being a part of Google’s protocol imperialism,

Google isn’t particularly American, nor is it particularly American / Western European. It’s important to remember that Google is much more a factor of daily life in Europe. In the United States it is just barely 70% of the search market, in Western Europe it is around 90% and in places like Portugal it is 96% and I don’t know why.

For Vaidhyanathan the biggest problem with Google is that as it expands into more parts of the world that are less proficient, and less digitally inclined, there will be more examples of friction and harm because more people are going to lack the awareness to cleanse their record.

It’s important to note that Google does offer services for protecting and managing user data.

Vaidhyanathan didn’t specifically mention these options, but briefly acknowledged the existence of such tools before quickly moving on to the strongest part of his argument, “We in this room are not likely to be harmed by Google because all of us in this room are part of a techno-cosmopolitan elite. Only the elite and proficient get to opt out.”

Google Street View Fail

In closing, Vaidhyanathan exemplified the problem with a photograph of a man caught on the side of a U.S. highway and commented, “This man doesn’t know that he is in Google Street View so we get to laugh at him. Not knowing is going to be the key to being a victim in this system.”

More information about Siva Vaidhyanathan and his criticism of Google can be found on his website, in this lively Google debate at IQ2, and in a New York Times article from last year.

Antoine Isaac: Semantic Search for Europeana

Posted: November 17, 2009 at 5:02 pm  |  By: Tjerk Timan


Thanks for the opportunity to talk. I work at the VU and I am talking about the project Europeana. This is the result of teamwork at the university; I am just presenting it.

Introduction
What is Europeana? It is a portal that wants to interconnect museum archives and provide access to digital content. Currently there are 50 providers, and the number is growing; 10 million objects is the target. From a more practical angle: we want to create access, but we also want to create channels to other websites and so on. Such a thing does not go without challenges, and these are of an international nature; the very issue of providing access is difficult. And how to get at the data besides the pictures? The method is to use metadata. Antoine shows the current portal, which he explains as a “basic search box”. If a search query is done, different results are given that are linked to the search (pictures, books etc.). You can start refining your search by filtering (language, date and so on). This is called semantic search, and it allows you to refine your search. To some extent, this does not match the richness of the data that is out there in the databases. The idea is to go a step beyond semantically enabled search. Some functionalities are explained, such as clustering. Antoine explains that by exploiting semantics, we can exploit relations that are stored in the objects. We can use information that is actually there already in the metadata. Some kind of organized knowledge is already there; we want to exploit it. The proper information properly accessible, that is the goal.

Example
A semantic layer on top of the ‘normal’ results is presented. A graph is shown of a semantic web. It needs to become more useful for users, according to Antoine: a common concept that can aggregate relations. A screenshot is given of the prototype. It is a mini-version of the total project: three museums are currently represented. You can start typing your search. The first difference (from a normal search engine, ed.) is that it will be able to provide you with concepts and locations that could match your string. If you select one of the results, you get a number of new possible links and clusters via criteria. It is important to notice that the results are coming from really specific entities. We can see that the subject “Egypt”, for example, gives a whole set of related objects. It is much more than a returned string.

This idea of having controlled entities can be used in more complex ways. Users can start exploring further knowledge and concepts. An example is given on the search “Egypt” and the meta results. We are now searching via concepts and relations. This is an example of richer information. I also got clusters like works created by somebody who was in Egypt, and so on. The reason for getting this object in the results is that the metadata links back to the subject (query). There is a kind of person space emerging here. Via this person, we can find out the place, and we end up in Cairo. One very important point is that we benefit from existing models and vocabularies. Via labels on concepts, these concepts can be linked. This is very important because now you can access this information. We continue by determining these links (exact matches and relational matches). A main characteristic of metadata is that it is heterogeneous: there are different description models, and you cannot really anticipate them. Some form of alignment is required in order for the system to work, because these databases use different vocabularies. A data cloud is presented which represents the different vocabularies in the three different museums. These vocabularies are glued together.
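The ‘glue’ can be pictured as alignment links between vocabulary terms, in the spirit of SKOS exact matches. Below is a hypothetical sketch (not Europeana’s actual code; the museum names, subject terms and object ids are invented) of how such links let a single query reach collections described with different vocabularies:

```python
# Hypothetical illustration of vocabulary alignment: two museums describe
# objects with their own subject terms; exact-match links "glue" the
# vocabularies together so one query reaches both collections.

# museum -> {object id: subject term from that museum's own vocabulary}
collections = {
    "museum_a": {"obj-1": "Egypte (land)", "obj-2": "Keramiek"},
    "museum_b": {"obj-7": "Egypt", "obj-9": "Textiles"},
}

# alignment links between vocabulary terms (akin to skos:exactMatch)
exact_matches = {("Egypte (land)", "Egypt")}

def expand(term):
    """A term plus every term it is aligned with, in either direction."""
    out = {term}
    for a, b in exact_matches:
        if a in out or b in out:
            out |= {a, b}
    return out

def search(term):
    """Return (museum, object id) pairs whose subject matches the expanded term set."""
    wanted = expand(term)
    return [(museum, obj)
            for museum, objects in collections.items()
            for obj, subject in objects.items()
            if subject in wanted]

print(search("Egypt"))  # finds obj-7 directly and, via the alignment link, obj-1
```

The point of the design is that neither museum has to adopt the other’s description model; only the links between terms are added.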

Conclusions
The semantics in our case are about getting structure into the data; it is about coupling the data. It is a flexible architecture, and it is about loading data, which makes ingestion of new data easy. You don’t need to fully merge the workings of all the institutions and content providers; it is about connecting structures together. It allows easier access to the different vocabularies: you can start your search and you are provided with different vocabularies. Next, we have to bring in more vocabularies. You can have quality data in this system. Finally, this vision of the variable links model is nice, but some semantic matching problems occur at this level; this is difficult. A link is given: you can try the portal here.

Questions
Rogers: Don’t you need an army if you want to actually make the links and translation between all the records?
Isaac: You are right; we actually implemented something (the three museums’ vocabularies), but we are not experts on library science. Until recently, however, the library scientists did not come out of their institutions. Now they start to realize they can integrate their knowledge. I believe this is an added value.

Rogers: Is this more than digitizing library systems? Is this indexable by Google?
Isaac: Yes, it should be.
Rogers: Is it deep-indexable? Isn’t this a huge policy question?
Isaac: This prototype publishes the data. You can see the source of the data.

Pemberton: An analogy: Tim Berners-Lee created a website that can point to all your data. What I see here is the same move, by linking the concepts, not the data. This provides a richer web.
Rogers: Is this a Europe-island web, then?
Cramer: We already have such a system: it is called RSS.

Audience: One method that I see here is that we need glue to link existing concepts and vocabularies; the other is to generate new vocabularies. To me that seems to be a large debate.
Pemberton: We use the same underlying technology. I see added value rather than competition.
Cramer: RDFa is not a vocabulary, it is a language to format the vocabulary (which is a huge difference).

Michael Stevenson presents a Google art expose

Posted: November 16, 2009 at 4:15 pm  |  By: Rosa Menkman

Michael Stevenson is a lecturer and PhD candidate at the Department of Media Studies, University of Amsterdam. For the Society of the Query evening program he presented a very interesting selection of artistic and activist projects that were engaged with (the re-attribution of) different elements related to Web search.

Query

The IP-Browser (Govcom.org) for instance played with the linearity of querying the Web. It creates an alternative browsing experience that foregrounds the Web’s machine habitat and returns the user back to the basics of orderly Web browsing. The IP Browser looks up your IP address, and allows you to browse the Websites in your IP neighborhood, one by one in the order in which they are given in the IP address space.
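The basic mechanics of the idea can be sketched in a few lines of Python. This is a hypothetical illustration under my own assumptions, not Govcom.org’s implementation; in particular, the use of the api.ipify.org lookup service is an assumption, and any plain-text “what is my IP” service would do:

```python
# Rough sketch of the IP-Browser idea: find the visitor's public IP, then walk
# through the neighbouring addresses in the IP space and see which answer
# with a web page. Illustrative only, not the Govcom.org tool.

import ipaddress
import urllib.request

def my_public_ip():
    # Any plain-text "what is my IP" service works here (assumption).
    return urllib.request.urlopen("https://api.ipify.org", timeout=5).read().decode().strip()

def neighbours(ip, span=5):
    """Addresses immediately before and after `ip` in the address space."""
    base = int(ipaddress.IPv4Address(ip))
    return [str(ipaddress.IPv4Address(base + offset))
            for offset in range(-span, span + 1) if offset != 0]

def has_website(ip):
    try:
        urllib.request.urlopen(f"http://{ip}/", timeout=3)
        return True
    except Exception:
        return False

if __name__ == "__main__":
    ip = my_public_ip()
    print("You are", ip)
    for addr in neighbours(ip):
        print(addr, "serves a page" if has_website(addr) else "no response")
```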

Shmoogle (Tsila Hassine/De Geuzen) also deals with linearity on the Web, specifically the linearity of Google’s search returns. De Geuzen state that the best search returns Google offers are not necessarily the ones at the top; unfortunately, this is where the average Google user gets stuck. Shmoogle returns the search results in a chaotic order, and in doing so it ensures greater democracy.

The Internet Says No (Constant Dullaart) is an animated, fully functioning Google page (Google is placed in a marquee frame). This work offers a pessimistic way to surf the internet.

The Misspelling-Generator (Linda Hilfling & Erik Borra). Erik Borra presented the work as a result of the fight against internet censorship. When doing a search in the Chinese version of Google on the Tiananmen Square Massacre, Linda Hilfling discovered a temporary loophole out of Google’s self-censorship in China. By deliberately spelling Tiananmen incorrectly, she was taken to web pages where other people had misspelled Tiananmen, and was thereby able to access pictures of demonstrations, as well as the legendary image of the student in front of the tank, through these incorrectly spelled sources. The Misspelling-Generator is a tool that can be used for internet activism: by writing variations like ‘tianamen’ and ‘tiananman’, the isolation politics of Google’s spelling corrector can be subverted and Google’s self-censorship circumvented.
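The core of the idea is easy to sketch. The following is my own toy illustration, not Hilfling and Borra’s actual generator, and it only covers a few simple variation types:

```python
# Toy misspelling generator: single-character deletions, duplications and
# neighbour swaps of a query term, which could then be fed to a search engine.

def misspellings(word):
    variants = set()
    for i in range(len(word)):
        variants.add(word[:i] + word[i + 1:])        # drop one letter
        variants.add(word[:i] + word[i] + word[i:])  # double one letter
        if i < len(word) - 1:
            variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])  # swap neighbours
    variants.discard(word)
    return variants

variants = misspellings("tiananmen")
print("tianamen" in variants)  # True: the loophole spelling mentioned in the talk
print(sorted(variants)[:5])    # a few of the other generated variants
```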


Images

Z.A.P. (ApFab) is an automatic image generation installation. First you add a word using the ApFab touch-screen, then the ZapMachine will grab an image from the Internet. This image is the most important visual representation of that word, at that time, according to the current Internet authority, Google. Finally, the individual images are incorporated into a new context, creating a new, tense state of meaning and random relations. With “Zapmachine: Who gave you the right?” ApFab is asking the following questions:

-How much control do we have over the generated collage as artists?
-How much influence do you have on this process?
-How does the collage relate to the initial intention by which the image was uploaded on the Internet by the original author?
-Who is the author of this Zap collage?

Disease Disco (Constant Dullaart): “To every suffering its thumbnail”. Dullaart used the Google image search-by-color option to query the word ‘disease’, changing color ‘rhythmically’. The work is accompanied by the US Billboard #1 hit song of the moment the work was created.

The Global Anxiety Monitor (De Geuzen) uses HTML frames to display automated image searches in different languages. Searching in Google for terms such as conflict, terrorism and climate change, this monitor traces the ebb and flow of fear in Arabic, Hebrew, English and Dutch.

Terms & Conditions

Cookie Monster (Andrea Fiore): To capture online behavior, thousands of HTTP cookies are sent daily to web browsers to identify users and gather statistical knowledge about tastes and habits. The Cookie Consensus website hosts a collection of cookies that Andrea Fiore received while surfing through the first 50 entries of the Alexa directory of news sites. In the future it will also host software that will give users the capability to create their own cookie collections.
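To make the tracking mechanism concrete, here is a minimal, hypothetical sketch (not Fiore’s own code) of collecting the cookies that a list of sites sets on a fresh visitor. It uses the third-party requests library, and the URLs are placeholders rather than the actual Alexa news list:

```python
# Hypothetical sketch in the spirit of Cookie Monster: visit each site with a
# fresh cookie jar and record which cookies it sets and from which domains.

import requests

sites = ["https://www.example.com", "https://www.example.org"]  # stand-ins for the news-site list

collection = {}
for url in sites:
    session = requests.Session()  # fresh cookie jar per site
    try:
        session.get(url, timeout=5)
    except requests.RequestException:
        continue
    collection[url] = {cookie.name: cookie.domain for cookie in session.cookies}

for url, cookies in collection.items():
    print(url, "set", len(cookies), "cookies:", cookies)
```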

I Love Alaska (Lernert Engelberts & Sander Plug) is a beautifully framed internet movie series that tells the story of a middle-aged woman living in Houston, Texas. The viewer follows her AOL search queries over a time span of months. “In the end, when she cheats on her husband with a man she met online, her life seems to crumble around her. She regrets her deceit, admits to her Internet addiction and dreams of a new life in Alaska.”


http://www.geuzen.org/anxiety/

Discussion session 2: Googlization

Posted: November 16, 2009 at 12:16 am  |  By: Tjerk Timan

With: Siva Vaidhyanathan, Martin Feuz  and Esther Weltevrede

Moderated by Andrew Keen.


Moderator: Why does no one talk about money?

Vaidhyanathan: Google only loses money. They have an interest in keeping people interacting with the Web. As long as you are interacting with the web, they can track you via cookies, and that puts more data in their database. It is a clear but third-degree connection for creating revenue. It also has an interest in data and text accumulation; it hopes to create a real text-based search. In terms of Google search, global and local are not really put to use for, for example, Google Books. This already biases the search results.

Weltevrede: It also depends on your perspective on Google. For me it is interesting to see how it works. How does it organize and present the information we get.

Vaidhyanathan: nobody is going to Google for the ads.

Audience (to Weltevrede): You were depending on the Google translation results? Isn’t that tricky?

Weltevrede: Indeed, Google Translate is still in beta version. However, human rights is such an important term that one can assume that it is translated well.


Audience: How about methods? It is difficult to pose yourself against the machine. All of us here agree that searching sucks and that Google is bad and commercial. So I’d like to have some reflection on methods: how can one be critical of searching, and how do these methods relate to your research?

Vaidhyanathan: Google is hard to study in a traditional way. I do my best to keep to fuzzy, flabby arguments of narrative and argument. Opacity is the problem of Google: you cannot research it without using it. You risk becoming a banned user. You have to warn Google about your research, by which you may alter the results.

Weltevrede: I agree. I want to add that you can study the inner workings by looking at the output; you can tell a lot from that.

Feuz: It is an attempt to look at temporal relations. You have to try and find ways to be able to ask these questions.


Moderator: What I do not understand is the connection between being the most opaque company ever and still fetishizing transparency.

Vaidhyanathan: It does not fetishize it; it leverages it. We do the work for Google, we provide the information and content; Marx would scream at this notion. We are all very happy to do it (user-generated content). It is a better environment than we used to have. However, we have to grasp the workings. Maybe we are very content with our relation to Google.

Weltevrede: It is also about what tools you can build out of Google. You can make use of the giant by building on Google; let Google work for us again.

Manovich (audience): I have difficulty seeing your (Feuz’s and Weltevrede’s) results as research. What is the sample size? Where is the statistical data? You haven’t looked at the interdependencies of the variables. So what kind of science is this? If these things are not clear, these results are not meaningful.

Feuz: There is a difference between types of research. In the kind of research I did, I worked four months in a team gathering data. The amount of data we needed was already overwhelmingly large. You have to keep in mind that the thing is really temporal.

Vaidhyanathan (to Manovich): Is what you do not very expensive? How can you do this?

Manovich: Most things are done in open source software and only take five minutes.

Rogers (audience) responds to the question by Manovich on what kind of science this is: it is diagnostics! Are local Googles furnishing local sources? It is a kind of critical diagnostics to see how Google works, and to look at the implications.

Manovich: Is it then issue exploration to be followed by hypothesis?

Moderator: I live in Silicon Valley; there is more skepticism there about Google. They cannot fight the real-time Twitter economy. What is the relevancy of Google right now? What are your thoughts about this? Will it remain strong?

Vaidhyanathan: I am very bad at predicting. For the sake of my book, I hope they stay relevant. The rapid changes of Google have made me realize I must not write about current companies anymore. You have to keep in mind, though, that the real-time web is not that connected (yet). So much of what Google tries to do is to satisfy the cosmopolitan elite, because this group makes the choices and the critics. What are the initiatives that Google has in India, China and Brazil? That is a more relevant development to look into.

Feuz: We researchers cannot cope with the patterns of change – they can adapt fast, so they will survive.


Esther Weltevrede: “The Globalisation Machine. Reinterpreting engine results”

Posted: November 16, 2009 at 12:07 am  |  By: Tjerk Timan


Lecture by Esther Weltevrede
As a follow-up to Martin’s talk, I am going to present some empirical works. These projects concern comparing engine results and the customization of location. The aim of this study is:

1) Building on Googlization theory and search engine critique.
2) Empirical study. Reinterpreting Google for localization studies.

The key question is: What type of globalization machine is Google?
In this light, a couple of cases will be presented. Weltevrede starts by posing that PageRank is Google’s way into the information environment. In an article published in 1998/1999, PageRank is mentioned as the global ranking system for all pages, specifically designed for all the info of the world. Although Google states that they use a large number of variables, PageRank is primarily based on the link. The question of this research is: when Google moves to the local, what happens to the local results? What if we look at some services that are local?
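Since the argument turns on PageRank being link-based, a compact, textbook-style sketch of the link computation may help. This is a generic power iteration under my own assumptions, not Google’s production algorithm, which layers many more variables on top of the link signal; the domain names are invented:

```python
# Generic PageRank power iteration over a tiny link graph (illustrative only).
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: score}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                      # dangling page: spread its weight evenly
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                for target in outlinks:
                    new[target] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

web = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
print(pagerank(web))  # c.example collects the most link weight
```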

A case:
Google “Amsterdam” and you get (respectively) red light, airport, coffee shops. This same query in Google.nl returns another set of results (arena, forest, tournament). Local domain Google is another method of localization (e.g. Google.de). There are 157 local Googles. The key variables are (as far as can be distilled: Google is not very transparent in providing this information):

  • IP address
  • top-level domain
  • webmaster’s page

If you visit one of these Googles (say, Google.be), you can also select pages from that locale (only provide me with results from Belgium, for instance). If you select this option, you get local results according to Google. Also, we notice that this particular Google is offered in three languages (French, Dutch and German, in this case). Another way that Google returns local results is via ‘region’, and of course a yellow-pages kind of search is offered via Google Maps. In answering what we can say about the type of machine Google is, Weltevrede states that it thinks globally and acts locally.

The first case study:
Local and international information sources. Research question: to what extent can the local domain Google present local results? Method used: query all the national Googles in their official languages, with the search term translated via Google Translate. The second step is to geo-locate the sources. Instead of going by host, we looked at the registration of the website, which is a more precise indication of who owns the website. The top ten results for the query “human rights” were collected.

A map shows the results for the selected national Google, Canada.

This map indicates that Canada has relatively many local results for ‘human rights’. We can also look at what the top results are globally. The UN is by far the largest source in the list. When we looked at the results more closely, the Declaration of Human Rights keeps popping up. Websites have often translated the declaration, and in all languages they call upon this source (one can interpret this as a way of SEO).

Next, a ranked tag cloud is shown.
We looked at these sources; blue-tagged sources contain the Declaration of Human Rights. Next, a ranked list of all countries queried is given. 40% of all national Googles do not get any local results. If you look at the top of the list, you see that Europe leads, while at the lower end it is mostly African and Middle-Eastern countries. We can see that the local domain does not mean that you receive local information sources. How then are local results defined? Is it maybe language? A search is done on all Arabic countries. This shows a language web, a shared language space. Does that mean that there are no local sources? In Lebanon, the term “human rights” is queried again. While this does return results, these results do not make it to the top; local sources are on the second page and beyond.

In order to test this claim (language), we looked at a geographical region defined by languages: Europe is chosen due to its local and distinct languages. The visual shows they have very local sources (again indicated by black domain names). The EU Googles hardly share sources; they are characterized by their local sources. This can be argued to be a language web.


We now move to the last example: comparing two Portuguese-speaking countries, Portugal and Brazil. Here we might conclude that the established local sources are privileged; language webs are preferred over local webs.


Search engine literacy (2nd case study)
We can use Google to read society; we have a particular way of interpreting search engine results. One example method: reading domain names and their main issues. Again, the example of human rights is used here. If we query this, we see a very established result list, where sources are relatively fixed. What happens when we query for a more dynamic topic? In this case a query is done on RFID in 2004; back then, this was a young space. We see sources competing for the top. Compared to the human rights space, it has rather young and technology-minded sites; the build-up of the list is really different. Another method for research is to look at the issues returned.


A third case study:
A query for “rights” is performed. What is the top ten list of rights per country? This research required the team members to read and interpret the various languages. The top ten most prominent rights in the local domains were collected and visualized in a total image. The color code: blue rights are shared, while black ones are culturally specific to their domains.

If we zoom in, we see that in Italy the unique rights are the right to forget and the right to nationality. In Japan, they have computer programming rights, for instance. In Australia, you have specifically men’s rights. One favorite: Finland’s ‘everyman’s right’, the freedom to roam in nature. If we are to draw conclusions from this case study, they would be: the globalizing machine can show the shared as well as the locally specific. Google is localizing at regional, national and local levels, showing both the shared and the specific. Local results do not mean local sources. Also, different regions of the web are offered, mostly via language.

For more information, see Govcom.org and the Digital Methods Initiative, in particular the DMI project page on The Nationality of Issues: Repurposing Google for Internet Research.