The second edition of the World Information Institute’s Deep Search conference series took place in Vienna on May 28. Where the first Deep Search symposium, held in November 2008 (find a review here), dealt with the history of information retrieval, the automatic classification of data, civil liberties, digital human rights, the power embedded in search systems, and the visibility of online content, this second edition promised to look more deeply into both the history and the future of classifying information and large datasets.
Panel 1: Visions of Organizing the World
Introducing the first panel, Felix Stalder notes how the ‘grand title’ of the panel emphasizes an important issue: the urge to organize the world’s information is as old as human culture. Themes reemerge – organization cannot exist without an operating model and an array of judgments as to what constitutes information and knowledge. A historical perspective is important in this respect, as seemingly new issues are seldom unprecedented.
Chad Wellmon: Google Before Google, or, On the History of Search.
First speaker Chad Wellmon is Assistant Professor of Germanic Languages and Literature at the University of Virginia.
Wellmon starts off by quoting a New York Times article in which a Media Studies professor claims that Facebook’s unwillingness to let Google crawl part of its content threatens the open and democratic arrangement of information on the Web. To such advocates the hyperlink is no more than a ballot, an embodiment of freedom. To the individual user, however, the Web in its fullness does not exist. Active linking confers structural integrity on one document and not on another. The hyperlink method of organization may be said to be less hierarchical than categorization, but to say that the Web is democratic in nature is to ignore the means by which we access it. Search technology and linking make the Web seem smaller and more manageable than it is, and highlight its fundamentally contingent nature.
In order to gain a historical perspective on all this, Wellmon traces the history of search technology, “a story of constraint and expansion”, back to what he feels is the prototype of the Web’s hyperlink: the eighteenth-century footnote. The Enlightenment project is a complex of footnotes and citations, one pointing to the next. Reflexivity resides in the footnote. Books ‘talked to each other’ in a constant citing process in which the relevance of one text was decided by footnotes pointing toward other texts. Reading the Enlightenment as a series of technologies to manage the intense proliferation of information, however, invites the question: what kind of knowledge is deduced from this citational logic?
Using a recent computer visualization of the citation process within an eighteenth-century encyclopedia, Wellmon shows the emergence of multiple subsystems within the encyclopedia, exposing the double character of search technology: citing leads to inner circling; it establishes an inside and an outside, inclusion by means of exclusion. This double logic, Wellmon suggests, may well produce the distinction between information and knowledge.
It is exactly this logic that is key to Brin and Page’s 1998 paper ‘The Anatomy of a Large-Scale Hypertextual Web Search Engine’, in which they say that the Web is based on a premise of citation (linking) and annotation (link description). The practice of pointing to one another’s work, to rank and render authority, is how Google works. PageRank is a recursive system; the parameters of the system are defined by the system itself and by the history of its own operations.
But there is one key difference: PageRank does not merely count links; it normalizes them in order to increase the quality of search results. Focusing not only on comprehensiveness but also on the relevance of search results meant a step beyond the Web as envisioned by Berners-Lee – information is no longer freely available to anyone. Learning from PageRank, Wellmon finds that radical openness is only half the story. The means of obtaining value is recursive in nature – the value of one page is a function of the value ascribed to it through links by other pages, a product of the system itself. What the Web values is determined by what the Web already, or historically, values.
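This recursive valuation can be sketched as a simple power iteration over a toy link graph. The page names, damping factor, and iteration count below are illustrative assumptions; this is the textbook form of the algorithm as described in Brin and Page’s paper, not Google’s production system.

```python
# A minimal sketch of PageRank's recursive logic: a page's value is a
# function of the value of the pages linking to it, normalized by how
# many links each of those pages casts. The four-page "web" is invented.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}           # start from a uniform guess
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                     # dangling page: share evenly
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                # "Normalizing" a link: its weight is the linking page's
                # rank divided by that page's number of outgoing links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += share
        rank = new
    return rank

web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(web)
```

Note how the result illustrates Wellmon’s point: page C, cited by three pages, outranks page D, which nothing cites – the system’s output is a product of the system’s own history of links.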
The proliferation of footnote upon footnote by scholars eventually led to ridicule and mockery. Both Kant and Descartes have noted that historic, or book knowledge is one of mere collection – one is only a repository of books and thinks in terms of them, rather than for himself. This, Wellmon feels, is the historical locus of information versus knowledge, and the reason why throughout enlightenment, encyclopedias, books and footnotes were dismissed in favor of self-originating production and recursivity.
Wellmon notes how the late eighteenth century proliferation of books prompted new systems to evaluate knowledge, and the distinction between information and knowledge became increasingly irrelevant. In a similar way, Google has undone this distinction. Is it even possible to write an algorithm that does not affect that which it retrieves? The logic of enlightenment belies the claim that Google merely retrieves and organizes information. Search is evaluative. Returning to the New York Times article, Wellmon argues that the free and democratic Web is merely an idea. We have access to a limited, contingent Web that is marked by our own searches, a function of technologies that encircle, include and exclude. Concluding his talk Wellmon notes how Brin and Page’s vision of the perfect engine that “understands exactly what I mean and gives me back exactly what I want” would represent the ultimate triumph for the citational logic; ‘what I want’ would be forever defined by ‘what I have always wanted’, or by what my demographic other has always wanted.
Yuk Hui: Critique of Search and the Problem of Knowledge
Yuk Hui is a PhD researcher on the Metadata Project in the Centre for Cultural Studies and the Department of Computing at Goldsmiths, University of London.
The title of the talk comprises two parts: a critique of search, and the problem of knowledge. Hui aims to show how the two are related, and how they relate to the current discourse on search. Search, in his view, is a broad term that includes more than just the engine or Google – Facebook is about searching and patterning too, and sensor-enabled mobile devices are also used to build coherent databases for further information retrieval.
Hui starts by noting the recent emergence of Web ontologies and the Semantic Web. The term ‘ontology’ originates with Aristotle and means ‘being qua being’. It was later understood to mean the categorization or classification of things, and currently the classification of metadata. FOAF (Friend of a Friend) is such an ontology, defining relations between people. Ontologies are used to construct a searchable network of data, and are often understood as ways toward the “organization of knowledge”.
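Concretely, a FOAF description reduces to subject–predicate–object triples that a program can traverse. The people and the tiny query helper below are invented for illustration – real FOAF data is published as RDF and queried with tools such as SPARQL – but the `name` and `knows` properties belong to the actual FOAF vocabulary, and the graph structure is the same.

```python
# A FOAF-style social graph flattened to (subject, predicate, object)
# triples. The individuals are hypothetical; the property URIs are FOAF's.
FOAF = "http://xmlns.com/foaf/0.1/"

triples = [
    ("#alice", FOAF + "name",  "Alice"),
    ("#alice", FOAF + "knows", "#bob"),
    ("#bob",   FOAF + "name",  "Bob"),
    ("#bob",   FOAF + "knows", "#carol"),
    ("#carol", FOAF + "name",  "Carol"),
]

def objects(subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# 'Searching' the ontology is graph traversal: friends of friends of Alice.
fof = [friend
       for direct in objects("#alice", FOAF + "knows")
       for friend in objects(direct, FOAF + "knows")]
```

The point of such a schema is exactly what Hui flags: the data becomes searchable only because a prior classification (the ontology’s categories) has been imposed on it.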
Specifically, Hui asks what is meant by ‘knowledge’ in this respect. Are algorithms and data structures knowledge? They are definitely so to programmers, but are they knowledge to us? It seems that we lack the means to think about them – we simply use them. When we talk about acquiring knowledge through search, should we not also be talking about a knowledge of search? Do we even have room for technological objects within our understanding of knowledge?
Like Wellmon, Hui refers to Kant, who distinguishes two types of knowledge: historic knowledge – the empirical kind – and rational knowledge, obtained through reasoning and the construction of concepts. Technological knowledge (data structures and algorithms), he feels, is neither historical nor rational, since we have merely a general idea of what these objects are, and this idea was not derived through any concept available to us. For many philosophers a technological object is no different from an apple, and philosophy has rendered no useful framework for us to address the technological object. We use such objects rather than know them.
Hui then explains how cognitive scientists Andy Clark and David Chalmers refer to technological knowledge as the ‘Extended Mind’. As the term itself suggests, the mind is completed through an extension – in Hui’s example, the mind of an Alzheimer’s patient is extended through a notebook used for looking up information such as, for instance, the location of a theatre. Clark and Chalmers argue that the relation between the patient and the notebook is comparable to the relation between a healthy person and her memory. Extending this idea, Hui argues that data is in fact Kant’s ‘historical’ or ‘factual’ knowledge, while data structures and algorithms are what constitutes ‘rational’ knowledge – data is subsumed under categories and processed by algorithms.
Technological or algorithmic knowledge thus seems to be located between historical and rational knowledge, modifying the transcendental nature of Kant’s faculties. Now, Hui claims, we are moving from categories as pure concepts to social categories. In the work of the sociologists Durkheim and Mauss, the understanding of social categories is not purely cognitive; they speak of the “cultural character of categories of understanding” – ideas of time, space, class, number, cause, substance, personality, etcetera.
Rather than Kant’s pure reason, this framework of intelligence is social and cultural, and it determines the scheme of classification and the formation of concepts. Technological knowledge has a role in the creation of social categories by forcing something into cognition. Hui names Facebook invitation objects as an example, as they change the way we understand an event or a friend. Zygmunt Bauman has termed this ‘the turn from social bond to network’.
Hui then turns to the problem of knowledge: a data structure or algorithm is not self-sustainable; it needs to be externalized in order to reach perfection. However, imposed standards limit the possibility of changes to the core of the technology. Meanings within data structures have to be sustained, or ‘universal’, for the digital milieu to maintain integrity and compatibility. The paradoxical nature of technological knowledge problematizes the concept of the organization of (historic) knowledge, as it demands a transparent indexing method for classification.
Technological knowledge has two dimensions: the extended mind and the constitution of social categories. The synchronization of meaning through data and algorithms is no longer within the scope of the “organization of knowledge”, but raises questions about the constitution of an “I” (extended mind) and a “we” (social categories). If technological knowledge becomes an uncontrollable force pushing us into the process of synchronization, isn’t personalization the opposite force? It is not, as personalization is possible only through the synchronization of technological knowledge.
Many theories and sociological studies have focused on understanding search through the items on the right side of the equation, but there are few inquiries into the items on the left side. Doing so invites not only sociological, political and engineering questions, such as those around Facebook privacy, but also existential ones. Hui emphasizes that he is not suggesting we resist new technologies, but rather that we seriously address this problem of knowledge, which we have ignored in our focus on data, content and the organization of knowledge.