Dirk Lewandowski: Why We Need an Independent Index of the Web

Short interview with Dirk Lewandowski from network cultures on Vimeo.

How can we create real alternative search engines? German professor Dirk Lewandowski was the third speaker in the session ‘Google Domination’. He argues that we need an independent index of the web. “We don’t need publicly funded search engines. Instead, we need a publicly funded search index.” Why? Because, he argues, an index lets us do much more than just web search.

A search engine index collects, parses, and stores data to facilitate fast and accurate information retrieval. It is a local copy of the web; search engines create direct replicas of documents. This representation includes more than just the text: information about the author, length, title, keywords, decay, date, PageRank, etc. is also stored. The representation of a website in a search engine does not always match the original page, and Google’s copy often lacks newly added information. It is impossible to be up to date at all times, yet a local, up-to-date copy of the web is the ‘holy grail’ for creating alternative search engines. However, this is not easily established.
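To make the notion of an index more concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how Google or any real engine works. The pages, URLs and crawl dates are made up, and the names `DocRecord`, `build_index` and `search` are hypothetical. Each page is stored as a representation (text plus some metadata), and a lookup table maps every term to the pages that contain it.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DocRecord:
    """The index's representation of a page: text plus metadata."""
    url: str
    title: str
    text: str
    fetched: str  # made-up crawl date; the live page may have changed since


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of URLs whose title or text contains it."""
    index = defaultdict(set)
    for doc in docs:
        for term in tokenize(doc.title + " " + doc.text):
            index[term].add(doc.url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)


# Made-up example pages (assumption: not taken from the talk).
DOCS = [
    DocRecord("http://example.org/a", "Web search",
              "An index is a local copy of the web", "2013-01-01"),
    DocRecord("http://example.org/b", "Libraries",
              "Libraries can use web data for analytics", "2013-01-01"),
]
INDEX = build_index(DOCS)
```

Calling `search(INDEX, "local copy")` looks up both terms and returns only the pages that contain all of them. Because the lookup runs against the stored copy, a page that changed after its `fetched` date would still be found under its old content, which is the freshness problem described above.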

At this moment Google dominates the market and handles 90% of search requests. Users therefore rely on Google’s way of ordering results and on Google’s method of collecting data. This is problematic since Google is a commercial company and the way its engine works is not transparent to its users. Despite this, there has been no real alternative to Google; even big companies like Microsoft struggle to establish their own web search engines. Moreover, the alternatives that do exist are often simply other user interfaces to Google.

There are some alternative search engines, like the start-up DuckDuckGo. They use the partner model: “real” search engine providers like Google or Bing operate their own search engines but also supply their search results to partners. Both sides can earn income through ads and revenue sharing. All the major web portals have now embraced this model, which thins out competition in the search engine industry.

Lewandowski distinguishes four categories of alternative search engines:

  1. All alternatives that are not Google. He also calls these “Google killers”.
  2. Alternatives that are not perceived as alternatives because they show almost the same results as Google, e.g. Bing. Users have no reason to switch to such an engine because it is not radically different.
  3. Engines that explicitly position themselves as an alternative to Google, e.g. Skeekport.
  4. New approaches to search, or “real alternatives”. Unfortunately, what they all have in common is that they have no meaningful market share.
Society of the Query #2

What is a good idea when it comes to alternatives, then? According to Lewandowski, a single, collaborative European alternative search engine is a bad idea; he is afraid it would fail. Betting on only one big alternative is risky: if a single element, such as an unappealing design, is disliked or does not work, the whole project fails. This is problematic since building a new index is costly and there are hardly any candidates with the resources to fund it. Hence, there must be a way to enable multiple alternative search engines so that the money is not lost.

As a solution, Lewandowski says we need to focus on building an alternative index that provides us with multiple options for search engines. Users should have the choice between different worldviews, which originate as a product of algorithm-based search result generation. And by multiple views, Lewandowski doesn’t just mean three or four. “We need to create the conditions that make it possible for individuals, companies and institutions to create their own search engine.” Fair competition would be the result.
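How one shared index could feed several engines with different “worldviews” can be sketched as follows. This is a hypothetical illustration in Python, not a design proposed in the talk: the index contents, the URLs and the two ranking functions `rank_by_frequency` and `rank_by_freshness` are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical shared index: URL -> (list of terms, year of last crawl).
SHARED_INDEX = {
    "http://example.org/old": (["search", "search", "search", "engine"], 2005),
    "http://example.org/new": (["search", "engine", "index"], 2013),
    "http://example.org/off": (["library", "catalogue"], 2012),
}


def matches(query_terms):
    """Return the URLs whose terms include every query term."""
    return [url for url, (terms, _) in SHARED_INDEX.items()
            if all(q in terms for q in query_terms)]


def rank_by_frequency(query_terms):
    """Engine A: pages that mention the query terms most often come first."""
    def score(url):
        counts = Counter(SHARED_INDEX[url][0])
        return sum(counts[q] for q in query_terms)
    return sorted(matches(query_terms), key=score, reverse=True)


def rank_by_freshness(query_terms):
    """Engine B: the most recently crawled pages come first."""
    return sorted(matches(query_terms),
                  key=lambda url: SHARED_INDEX[url][1], reverse=True)
```

Both functions read the same data, yet for the query `["search"]` engine A puts the older, keyword-heavy page first while engine B prefers the newer one. The expensive part, building and maintaining the index, is shared; only the comparatively cheap ranking layer differs per engine.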

The project should be an index of the web that everyone can access under fair conditions and at low cost. Users who need larger amounts of data would have to pay. The search engines built on it do not have to be web-wide engines only; libraries, for example, could do a lot with the data of the web. An independent index has many advantages. For example, it motivates companies to create their own search applications, and it lets us go well beyond search and perform analytics on web data.

However, this project needs a lot of funding and cannot be supported by one country alone. Lewandowski imagines it should become a pan-European initiative. The open question is: who would operate and fund the index?

Society of the Query #2 – Dirk Lewandowski: Why We Need an Independent Index of the Web from network cultures on Vimeo.