Deep Search ll: Panel 4, Contextual Modeling and closing discussion

Posted: July 11, 2010 at 5:04 pm  |  By: Shirley Niemans

This entry is part 4 of 3 in the series Deep Search ll

Panel 4: Contextual Modeling

An unstorable and unmanageable amount of data is coming at us, bringing with it a host of new strategies, such as visualization models, for grasping and analyzing the huge amount of bits and bytes.

mc schraefel: Beyond Keyword Search
Dr schraefel is reader in the Intelligence, Agents and Multimedia Group at the University of Southampton, UK.

Schraefel first emphasizes that, contrary to what people may assume of a visualization expert, she is not ‘in love with graphs’; most of the time, big fat graphs suck. The research she presents here deals with the circumstances of serendipity. Following the idea that ‘fate favors the prepared mind’, she argues that discoveries never happen by chance and that an important challenge lies in designing tools that support serendipitous discovery.

She then presents the audience with a 1987 video by Apple Computer, which introduces the ‘Knowledge Navigator’: a tablet-like personal device with a natural language interface, a virtual ‘digital assistant’ and access to a global network of information. Outdated as the device may seem today, the digital assistant seemed able to create graphs by lifting data out of its original context (such as other people’s documents), so that the data could be mined and combined to answer a variety of questions. In 1987, schraefel comments, this was a vision of exploration, heterogeneous sources, representation and integration that still inspires research into knowledge building today.

Schraefel notes how Google is the current search paradigm – “what else do you need?”. Drawing a parallel, she notes how the model of Newton’s Principia Mathematica set the tone for seeing the world for ages, until it turned out that in some spaces the model was flawed. It is much the same with Google’s document-centric, single-source search without interrelations: the model frames the questions that may be asked. In order to enable knowledge gathering, we need a different one.

In a 2001 Scientific American article, Tim Berners-Lee, James Hendler and Ora Lassila introduced machine-readable mark-up and the Semantic Web as a new paradigm that moved away from keyword search and toward structured data and ontologies. Ontologies in this sense are built from subject-predicate-object statements, such as ‘composer is-a person’ or ‘person has-a name’. By giving data a rich (and often multiple) metadata context and applying some logic, one may infer properties of objects that are not explicitly labeled, and enable knowledge gathering from heterogeneous sources.
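To make the idea of inferring unlabeled properties concrete, here is a minimal sketch in Python. It is not taken from the talk: the predicate names `is_a` and `has_a`, the example entity `Bach` and the two toy rules are assumptions chosen purely for illustration, and real Semantic Web tooling works with standardized vocabularies and reasoners rather than a hand-written loop.

```python
# Illustrative subject-predicate-object triples.
# The vocabulary ("is_a", "has_a") and the entity "Bach" are hypothetical.
triples = {
    ("composer", "is_a", "person"),
    ("person", "has_a", "name"),
    ("Bach", "is_a", "composer"),
}


def infer(triples):
    """Apply two toy rules until no new triples appear.

    Rule 1: is_a is transitive (Bach is_a composer, composer is_a person).
    Rule 2: has_a is inherited through is_a (person has_a name).
    """
    facts = set(triples)
    while True:
        new = set()
        for (a, p1, b) in facts:
            if p1 != "is_a":
                continue
            for (c, p2, d) in facts:
                # Chain only when the object of the first triple
                # is the subject of the second.
                if b == c and p2 in ("is_a", "has_a"):
                    new.add((a, p2, d))
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


inferred = infer(triples)
```

Nothing in the input says that Bach has a name; the statement `("Bach", "has_a", "name")` only appears in `inferred` because the two rules combine what is said about composers and persons.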

Does this imply a reprise of Victorian taxonomies? Nope, quoting schraefel: “it is more pomo than that”; objects are described from multiple contexts. There is no über-ontology, and we are slowly learning to be ‘ok’ with the fact that we cannot know everything in a controlled way, and to be messy. Following Berners-Lee, she emphasizes the importance of liberating our data: placing sources freely on the web so that we may ask questions other than the document kind, and create information rather than merely retrieve it.


Define: Web Search, Semantic Dreams in the Age of the Engine

Posted: July 8, 2010 at 12:25 pm  |  By: Shirley Niemans

During my research internship at the Institute of Network Cultures in 2008/2009, I was given the opportunity to explore the broad field of Web search using the Institute’s elaborate network and the extensive knowledge of its staff, and to deliver an editorial outline for the Society of the Query conference. This research also culminated in an MA thesis in December 2009 that has recently become available for downloading at the Igitur Library of Utrecht University. Please find an abstract below, and a download link here.

Abstract: In 2000, Lucas Introna and Helen Nissenbaum argued that search engines raise not just technical, but distinctly ethical and political questions that seem to work against the basic architecture of the Web, and the values that allowed for its growth. Their article was the starting point of a critical Web search debate that is still gaining foothold today. When we consider the semantic metaphor that has been inspiring a refashioning of the Web architecture since 2001, we can see the exact same values of inclusivity, fairness and decentralization reappear that fueled the development of the original WWW. This thesis will explore the ‘promise’ of the Semantic Web in light of the current debate about the politics of Web search. I will argue that a balanced debate about Semantic Web developments is non-existent and that this is problematic for several reasons. Concluding the thesis, I will consider the dubious position of the W3C in enforcing the implementation of new standards and the power of protocol to be an ‘engine of change’.


Deep Search ll: Panel 3, Rent and Bias

Posted: June 26, 2010 at 4:03 pm  |  By: Shirley Niemans

This entry is part 3 of 3 in the series Deep Search ll

Panel 3: Rent and Bias

After dwelling in the eighteenth, nineteenth and twentieth centuries in the morning panels, Felix Stalder comments that the program’s strict chronological order will now lead us into the twenty-first century. Keeping the metaphor of the map and the mapmaker alive, the next two speakers will talk about the politics and interests involved in processes of ranking, mapping and creating order in search results. Two such politics are ‘bias’ – why does a certain ranking exist – and ‘rent’ – how are all these practices transformed into a business.

Elizabeth von Couvering: Economic Bias in Search Results
Elizabeth von Couvering is a recent PhD graduate of the London School of Economics.

Contrary to the earlier presentations, Von Couvering’s talk shifts from what search engines should be to what they are today. Her major concern is the responsibility that information vehicles have to the public interest. Bias gets embedded in search results in a number of ways. First of all, search engines do not index the whole Web. Secondly, they do not index reliably. Furthermore, some engines systematically favor certain sites, and the local advertising market has also proven to play a major role in the quality of the indexing process and the subsequent size of the index: if you don’t have enough to offer, you will get a reduced quality of service. Search engines are a matter of public interest since they help people find things they don’t know about, and people are unsophisticated in their queries; they tend not to look beyond the first page of results and tend to trust the rankings. Bias, then, has major implications.

Many early engines have merged over time. From 1996 on, media companies bought up search engines as they proved to attract large audiences. The ‘integrated portals’ that emerged were selling an audience to advertisers; the classic media model of production, packaging and distribution. Many search engines died under this audience-based model, as the engine itself was often not developed anymore. Currently we have moved toward paid performance advertising, pay-per-click, a traffic-based value chain. Google is no longer looking at an audience but at the movement of users from one site to another. Search engines have become online media giants with an incredible market share and ‘gaming the system’ has become a profitable professional activity.

What has been done to address the problem of bias? Von Couvering points towards search engine efforts to improve search quality by focusing on relevance and customer satisfaction. What constitutes a relevant result is based on a customer’s frame of mind. In terms of the technology, relevance is an objective indicator of search engine retrieval quality. Relevance – not fairness, diversity, objectivity or formative value for instance. Defining quality as relevance is problematic. You can’t succeed in working toward a less biased search engine, unless you get beyond the idea of relevance, and introduce an alternative mode of framing search results.

Von Couvering argues that there is a need for a discussion of professional codes of ethics for information scientists. Engineering goals are primarily described in terms of efficiency, or sometimes ‘elegance’. She feels that there is room for standards such as those that exist in library science, for instance access for everybody, or in journalism, where seeing both sides of a story is a central element of professional development. There is a need for public debate on an Internet that is other than a market place or a retail store, which she found was a recurring theme in her research. She concludes: “This is not information retrieval, this is sales.”


Deep Search ll: Panel 1, Visions of Organizing the World

Posted: June 11, 2010 at 3:09 pm  |  By: Shirley Niemans

This entry is part 1 of 3 in the series Deep Search ll

The second edition of the World Information Institute’s Deep Search conference series took place in Vienna on May 28. Where the first Deep Search symposium, held in November 2008 (find a review here), dealt with the history of information retrieval, the automatic classification of data, civil liberties, digital human rights, the power embedded in search systems and the visibility of online content, this second edition promised to look more deeply into both the history and future of classifying information, and large datasets.

Panel 1: Visions of Organizing the World

Introducing the first panel, Felix Stalder notes how the ‘grand title’ of the panel emphasizes an important issue; the urge to organize the world’s information is as old as human culture. Themes reemerge – organization cannot exist without an operating model and an array of judgments as to what constitutes information and knowledge. An historical perspective is important in this respect, as seemingly new issues are seldom unprecedented.

Chad Wellmon: Google Before Google, or, On the History of Search.

First speaker Chad Wellmon is Assistant Professor of Germanic Languages and Literature at the University of Virginia.

Wellmon starts off by quoting a New York Times article in which a Media Studies professor claims that Facebook’s unwillingness to let Google crawl part of its content threatens the open and democratic arrangement of information on the Web. To such advocates the hyperlink is no more than a ballot, an embodiment of freedom. To the individual user however the Web in its fullness does not exist. Active linking confers a structural integrity to one document, and not to another. The hyperlink method of organization may be said to be less hierarchical than categorization, but to say that the Web is democratic in nature is to ignore the means by which we access it. Search technology and linking make the Web seem smaller and more manageable than it is, and highlight its fundamentally contingent nature.

In order to gain a historical perspective on all this, Wellmon traces the history of search technology, “a story of constraint and expansion”, back to what he feels is the prototype of the Web’s hyperlink: the eighteenth century footnote. The enlightenment project is a complex of footnotes and citations, one pointing to the next. Reflexivity is in the footnote. Books ‘talked to each other’ in a constant citing process in which the relevance of one text was decided by footnotes which point toward other texts. Reading enlightenment as a series of technologies to manage the intense proliferation of information however invites the question; what kind of knowledge is deduced from this citational logic?

Using a recent computer visualization of the citation process within an eighteenth-century encyclopedia, Wellmon shows the emergence of multiple subsystems within the encyclopedia, exposing the double character of search technology: citing leads to inner circling; it establishes an inside and an outside, inclusion by means of exclusion. This double logic, Wellmon suggests, may well produce the distinction between information and knowledge.


Berliner Gazette: Suchen, Spielen, Lernen by Konrad Becker

Posted: March 11, 2010 at 2:38 pm  |  By: Shirley Niemans

Suchen, Spielen, Lernen (to search, play and learn) is a recent essay by Konrad Becker, director of World-Information.org and co-editor of the upcoming Deep Search ll symposium in Vienna as well as the upcoming volume Critical Strategies in Art and Media (Autonomedia, 2010). The essay is available (in German) at the Berliner Gazette: http://berlinergazette.de/suchen-spielen-lernen/.


Conrad Wolfram on Information, Computation and the New Era of Knowledge

Posted: March 5, 2010 at 3:16 pm  |  By: Shirley Niemans

The Berlin Transmediale event Ideologies and Futures of the Internet that took place on Saturday, 6 February featured a keynote lecture by Conrad Wolfram, Director of Strategic and International Development at the well-known software company Wolfram Research Inc., founded by his brother Stephen. At transmediale.10 Conrad Wolfram talked about the knowledge engine Wolfram Alpha and his vision of the future of knowledge and the Web.

The video of his lecture, entitled ‘Wolfram Alpha: Information, Computation and the New Era of Knowledge’, is available here.

In May 2009, Wolfram|Alpha launched to worldwide excitement. It introduced the new concept of “computational knowledge engine” — working out specific answers to queries made rather than picking out existing information like a traditional search engine.

Yet this is just one window into the Wolfram|Alpha project and its vision for all systematic knowledge: of curating, making computable and democratizing high-level use.

Which technologies make this feasible now? Can we democratize computational expertise as successfully as the web and search have democratized information retrieval? How will this affect the Knowledge Economy—business and government information, R&D and technical education?

Conrad Wolfram addressed these questions and explained the concept, workings, and progress of Wolfram|Alpha and the underlying Mathematica technology, built up over 23 years, that has made it possible.

Conrad Wolfram, physicist and mathematician, is strategic director of Wolfram Research as well as European managing director. Founded by his brother Stephen, Wolfram Research is the maker of Mathematica software and spin-off Wolfram|Alpha knowledge engine. A recurring theme for Conrad Wolfram has been the democratisation of computation through automation and interactivity. He argues that this direction will be increasingly important for technical jobs and everyday living; and that in turn this changes how we should teach mathematics and related subjects. He has led the effort to move the use of Mathematica from pure computation system to development and deployment engine, instigating technology such as the Mathematica Player family and webMathematica. He serves on the Computer science committee Advisory Board at Kings College London and was on the founding committee of the IMS conferences.


Teresa Numerico on Cybernetics, Search Engines and Resistance

Posted: November 13, 2009 at 11:54 pm  |  By: Liliana Bounegru

Teresa Numerico is a lecturer at the University of Rome, where she teaches history and philosophy of computer science and epistemology of new media. Her presentation brought a history and philosophy of science perspective to the themes of this conference: web search, search engines and the society of the query. She attempted to see today’s search engines through the lens of cybernetics. According to her, digital technologies today intertwine the cybernetic concepts of communication and control. Just as cybernetics had to deal with communication and control, so search engines today mediate between cooperation and monopoly.

But how, more precisely, is the cybernetic approach embedded in search engines? According to Teresa Numerico, search engines have a lot in common with the cybernetic approach to machines and to creating a cognitive framework: search engines are black boxes in that the ranking process is not transparent, the search function returns output almost automatically in response to external input, and the ranking algorithm presupposes self-organization within the network.

By offering a strong cognitive framework, search engines are doing the work of the archive, hence her call for an “archaeology of techno-knowledge of search.” Her notion is influenced by Foucault’s Archaeology of Knowledge. According to Foucault, “The archive is the first law of what can be said. […] But the archive is also that which determines that all these things said do not accumulate endlessly in an amorphous mass […]; but they are grouped together in distinct figures composed together in accordance with specific regularities.” (Foucault, 1969/1989: 145-148).

Her main questions in relation to this direction of research into search engines were: Who controls the archive and its meanings, given that we have no control over the meaning that comes out of this work? Who is defining the web society archive? And ultimately, what is to be done? According to Teresa Numerico, the only possible reaction is resistance. She concluded her presentation with a practical list of suggestions for acts of resistance that any of us can take: be creative, not communicative, in order to elude the control component of communication as well as of archiving and searching; minimize the number of online tracks that you leave; switch off internet devices every now and then; make an effort to vary your sources of knowledge by consulting different search engines; and maintain a cross-media orientation in order to verify the trust and authority of one source against others.



A New View On Old Search Engines

Posted: June 16, 2009 at 11:22 am  |  By: Dennis Deicke

Review of Gugerli, D. (2009). Suchmaschinen. Die Welt als Datenbank. Frankfurt: Suhrkamp.

In his book Search Engines, The World as a Database (Suchmaschinen, Die Welt als Datenbank) the Swiss historian of technology David Gugerli describes the forerunners of Internet search engines in the second half of the 20th century by means of four case studies. He starts by examining two German television shows, which he considers early forms of search engines that provided certain functions demanded by society. He then analyses the methods invented by the German BKA (the German Federal Criminal Police Office) in the early 1970s. Finally, Gugerli explains the development of search engines using the idea of the relational database invented by Edgar F. Codd in 1969.

In the introduction Gugerli depicts the ubiquity of the search engine Google and all its additional services. He then reminds the reader that before Google there were different sorts of search engines that worked outside of the Internet. Earthquake zones and low-pressure systems, for example, were detected by satellites, sensors and simulations. Superstars and scandals were detected by TV stations. Managers searched for information in corporate databases, which were not open to everyone. Gugerli mentions that every type of search engine is situated in an area of conflict between overview and surveillance. Search engines are connected with hopes of democratization, informational emancipation and complete overview; conversely, they are also linked with fears of an Orwellian state of permanent observation. Gugerli identifies four features that all search engines have in common. First, search engines presuppose that the aims of their operation can be objectified. Secondly, search engines operate in a concrete address space: they can only work if they can link the searched object with an address. Thirdly, search engines follow a certain pattern from which they cannot deviate, yet they simultaneously show a fundamental openness to results. Fourthly, search engines have a special proximity to games and simulations.

The first case study David Gugerli considers is the old German TV show “Was bin ich?” (What am I?), which aired between 1961, the year Gugerli was born, and 1989 and was hosted by Robert Lembke. The idea of the game was to guess the profession of the guests on the show. Each guest had to provide four pieces of information at the beginning: a signature, a statement of whether they were employed or self-employed, a gesture typical of their job and the choice of a piggybank color. During this procedure the guest’s profession was revealed to the TV audience. A team of four (more or less) famous persons then used these four inputs to find out the guest’s job. They asked questions that could only be answered with “Yes” or “No”, and for every “No” the candidate received five DM (Deutsche Mark), which were put into the piggybank of the chosen color. “Was bin ich?” was a very successful TV show for almost 30 years, and Gugerli identifies an interesting reason for this success. He argues that people in Germany demanded reliability of expectations: the audience wanted the certainty that professions and people could be linked. The structure of the show offered a method for linking professions with persons in an exemplary way. Gugerli labels this possibility of linking jobs and persons as normal and therefore concludes that “Was bin ich?” was a search engine seeking the “normal” in German society. In a next, comprehensible step Gugerli places this desire for reliability in its historical context. After World War II people searched for a new identity because the old structures of identification had vanished. Gugerli concludes that “Was bin ich?” supported this process of self-discovery.
It showed that a profession was a stable attribute of a person that could be discovered using the simple mechanism of the show. Later society changed, but the show stayed the same for almost 30 years and absorbed the complexity that emerged from the social changes beginning in the 1960s. The mechanism of the show reduced the question of individual identity to what someone was, not who, and in this way objectified the question.

The second case study that Gugerli, a professor at the Technical University Zurich (ETH Zürich), uses for illustration is the German TV show “Aktenzeichen XY … ungelöst”. The show went on the air in October 1967 and was hosted by Eduard Zimmermann. In the show Zimmermann presented unsolved criminal cases, which were re-enacted by performers. After each clip, the host talked to a police expert to give the audience additional information. Viewers at home were then asked to provide the police with relevant information. In this manner the show tried to find a delinquent based on the criminal practice and the traces of the crime. The result of this procedure was reliability of expectations concerning the deviant: the aim of the search was to connect a criminal act with the associated delinquent and to link his position with an address. In contrast to “Was bin ich?”, this show did not provide the audience with an image of the normal but with an image of the deviant. “Was bin ich?” was a search engine looking for the normal in society, while “Aktenzeichen XY” was searching for the opposite, the deviant. This is where Gugerli detects the entertainment potential of the show: by searching for the deviant, the show stabilized the amusing distinction between normal and abnormal. In the show the wanted criminal no longer enjoyed the presumption of innocence; the show put everyone under general suspicion. The audience formed a giant living network that provided information like a database, with the advantage that it did not need to be fed with information by the police and Zimmermann beforehand. The show objectified by considering cases and files, then subjectified the cases again by re-enacting them with actors. After this simulation, in which the audience became witness to the crime, the case was objectified again by the police expert, who provided additional and real details.

As the third case study exemplifying the function of a search engine, David Gugerli selected the methods of the BKA (Federal Criminal Police Office) that were introduced when the new BKA president Horst Herold started his work in 1971. Herold built up a giant computer database system containing all the information that had been collected by the German police. On this basis Herold created a search engine that was meant to find statistically verifiable patterns of the deviant. These results were supposed to serve as arguments for the prevention of crime and were the basis for flexible manpower planning. Repression was to be replaced by prevention, contention by dynamics, command by control, experience by logic and hypothesis by prognosis. The allocation of police resources followed the results of the analysis and the patterns that had been found, and was adapted flexibly. But in contrast to Zimmermann and Lembke, Herold himself had to create the basis for his search engine: he transformed information on paper into electronic data, and facts were linked with addresses and were retrievable at any time. This data could be combined and compared, which opened up new forms of criminological research; for example, it was possible to search for “all 19 year old bakers with a Swabian dialect”.

Furthermore, Herold’s search engine became omnipresent: it connected all police stations and reduced the distance between the centre and the periphery, as the system’s intelligence moved from the centre to the peripheral elements. In the end the database of the BKA was connected with international networks, so that the German data could be accessed from all over the world. To make the search engine operable, the BKA implemented several steps of objectifying the data. A fingerprint, for example, was first captured as a photo, then enlarged, and its characteristics were fixed as mathematical expressions and saved as a file in the database. The idea of searching for patterns of socially deviant behaviour, in order to take preventive action that would replace the search for the delinquent, was based on the substantial objectification of the traces and characteristics of delinquents. Thus an attribute deviating from the norm could yield decisive information for the police. This system depended on a giant amount of information and therefore started to stagnate when its channels of information became overloaded. After describing explicitly how Herold’s “cybernetic police” worked, Gugerli explains that the idea of a “cybernetically controlled, failure-free society” failed because of the masses of information the system had to deal with. The terror of the RAF during the 1970s legitimized and stabilized the work of Herold’s engine until the resources of the system were exhausted.

The last example David Gugerli discusses is the relational database as imagined by Edgar F. Codd in 1969, which has more to do with the type of search engine we are used to. Codd’s aim was to create a database that allowed all records to be combined with each other and all kinds of possible connections between them to be investigated. His main idea was that users of future databases should not need special knowledge to use them. In fact, it was his view that people have to be protected from depending on knowledge of the internal organisation and functionality of the data in which they are interested. Until Codd’s time, hierarchical databases had predefined ways of gaining access to the information they stored. New kinds of questions were therefore only possible if the user knew the storage structures of the database he or she wanted to consult. By changing this, Codd expected users to become more specialized in asking questions, while the people programming the database ensured a reliably operating system. This gave people the opportunity to use the database as a black box which they could ask whatever they wanted. Consequently, the use of the search engine changed from seeking certain items to an open query for results. Together with his employer IBM, Codd developed the project “System R”, an attempt to build a database usable even by people with little knowledge of computers. To facilitate this, they invented the “Structured English Query Language” (SEQUEL), which enabled an easier way of querying. They had in mind a manager who needs information to make a decision, regardless of his knowledge of programming and databases. This new type of search turned the computer into an important economic search engine that could be used as an instrument of rationalization.
Relational databases helped companies to reduce transaction costs and to expand the possibilities of combining resources, because they lowered the investment necessary for analysis. In Germany these ideas changed the culture of database usage: it was now possible to query in real time, and users and data were separated by standard software.
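The difference between navigating predefined access paths and simply stating what one is looking for can be sketched with a small example. The following is only an illustration, written in Python with the standard-library `sqlite3` module and modern SQL, the descendant of SEQUEL; it is not a reconstruction of System R or of the BKA system. The table `persons`, its columns and the sample rows are invented for the example, and the question asked is the one quoted earlier in this review.

```python
import sqlite3

# In-memory database; table name, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (name TEXT, age INTEGER, profession TEXT, dialect TEXT)"
)
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?, ?)",
    [
        ("Anna", 19, "baker", "Swabian"),
        ("Bernd", 19, "baker", "Bavarian"),
        ("Clara", 34, "baker", "Swabian"),
        ("Dieter", 19, "mechanic", "Swabian"),
    ],
)

# The query states WHAT is wanted, not HOW the data is stored or traversed:
# "all 19 year old bakers with a Swabian dialect".
rows = conn.execute(
    "SELECT name FROM persons "
    "WHERE age = ? AND profession = ? AND dialect = ? "
    "ORDER BY name",
    (19, "baker", "Swabian"),
).fetchall()

matches = [row[0] for row in rows]
conn.close()
```

The person asking needs to know the column names but nothing about files, indexes or storage order, which is the separation between users and data described above.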

At the end of his book Gugerli points out that Western societies of the 20th century are characterized by the flexibilization of expectations and the situational recombination of resources. For him, these attributes have been supported by search engines, which made it possible to locate addressable objects and increased the possibilities of accessing them. In this part Gugerli comes to the main issues of the book: he states that search engines produce overviews, determine priorities and create differences between the things they include and the things they exclude. Furthermore, Gugerli gives a logical reason why search engines have a political history: they hold the user’s attention through a certain structure of data spaces, programs and presentation of results.

David Gugerli’s book opens up a new view on the work of old search engines. We usually think of Internet search engines like Google, but he reminds us that searching was an important task in society before the emergence of the Internet. With his chosen examples he demonstrates the development of search engines and successfully creates a historical space for reflection, which was his intention. His detailed descriptions of the characteristics of each search engine make it easier to understand how the examples functioned as search engines in their temporal and social context. The examples and explanations help us to look at today’s omnipresent Internet search engines in a more differentiated way and to understand how search engines have become an essential foundation of modern society.

Links:

This article as a PDF: Review on D. Gugerli

Biography: Information about D. Gugerli

More information: Resources
