Ortega: Wikipedia’s Self-Regulating Patterns in Open Numbers

CPoV Wikipedia Conference

Developing his own open source software, Ortega copied the top ten language Wikipedia sites to present us with an impressive quantitative corpus of data. In his presentation he cast a critical eye on the development and publication of quantitative data in online research worldwide, calling for more open practices. He pointed to the lack of comparative studies and the need for open data to assist global comparisons. Indeed, Ortega experienced the lack of a worldwide perspective in quantitative studies at first hand when he started his research, as most data was not available in the public domain, didn’t use open software, or, even worse, used categories that made comparisons impossible. His work is to a large extent a reaction to this lack.

Instead of using off-the-shelf software, Ortega created WikiXRay, the ultimate open wiki machine. WikiXRay is now available to researchers worldwide, together with the pool of data from his research. Ortega was eager to note that the software is easy to use on any wiki website.

In his research design, Ortega decided to include some open questions, such as “Is Wikipedia a sustainable project?” or “What type of parameters affect Wikipedia?”, to analyse some 7 terabytes of content, that is, the 10 most popular language Wikipedia sites! Ortega found there are 4,805,713 registered editors in the top ten language Wikipedias. These users use Wikipedia for at least 346.9 days, something like 141.6 on average.

His analysis shows that growth in all language versions follows an exponential pattern, i.e. it starts slowly and then accelerates. This is particularly surprising in light of the differences in the number of contributors. The same pattern repeats in the creation of pages in all ten languages. For Ortega these patterns point to a key question: does Wikipedia reach a maturity stage where activity stops progressing, and if so, why can’t it keep growing? Ortega mentioned that in answering this question the media have interpreted his data in opposite ways!
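One way to check whether a growth curve is exponential is to fit a straight line to the logarithm of the counts. The sketch below is only an illustration of that idea: it is not Ortega's method or WikiXRay code, the function name `fit_exponential` is hypothetical, and the monthly figures are made up rather than taken from his data.

```python
import math

def fit_exponential(counts):
    """Fit counts[t] ~ a * exp(b * t) by least squares on log(counts).

    Hypothetical helper for illustration; all counts must be positive.
    """
    n = len(counts)
    ts = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_t = sum(ts) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope and intercept of log(counts) against t.
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, logs))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_t)
    return a, b

# Made-up monthly article counts growing by 50% per month (assumption).
monthly_articles = [100 * 1.5 ** t for t in range(12)]
a, b = fit_exponential(monthly_articles)
doubling_time = math.log(2) / b  # months needed for the count to double
print(f"a={a:.1f}, growth rate b={b:.3f}, doubling time={doubling_time:.2f} months")
```

If the points on the log scale fall close to a straight line, the exponential description is reasonable; a curve that flattens on the log scale would instead suggest the kind of maturity stage Ortega asks about.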

Ortega also compared tiny versus standard articles. For example, in the English version 80% of pages are talk pages, while in the Polish Wikipedia there are no talk pages.

With regard to the sustainability issue, Ortega was keen to show that the number of edits by people has remained stable since 2007. He also briefly pointed to the Wikipedia general survey of 130,576 people, which showed that 65% of users are readers, 10% are regular contributors (50% of answers came from Russia), and only 13% are women. He was careful, however, to point out that the survey did not sample users and is therefore limited in how its results can be interpreted.

Ortega also noted the inequality of contributions amongst editors. For example, 5% of authors account for more than 90% of the total number of revisions. Finally, Ortega showed that 4 years ago the inequality in distribution reached a plateau and has remained the same each month worldwide since then.
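A figure such as "5% of authors account for more than 90% of revisions" can be computed from a list of per-author revision counts. The following is a minimal sketch under stated assumptions: the function name `top_share` is hypothetical, the sample numbers are invented to mimic a heavily skewed distribution, and nothing here reproduces Ortega's actual dataset or tooling.

```python
def top_share(revisions_per_author, top_fraction=0.05):
    """Share of all revisions made by the most active `top_fraction` of authors.

    Hypothetical helper for illustration only.
    """
    counts = sorted(revisions_per_author, reverse=True)
    # Number of authors in the top group (at least one).
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / sum(counts)

# Invented sample: 100 authors, 5 very active and 95 occasional (assumption).
sample = [380] * 5 + [2] * 95
share = top_share(sample)
print(f"Top 5% of authors made {share:.1%} of all revisions")
```

Running the same calculation on each month's revisions would show whether the share stays flat over time, which is the plateau Ortega describes.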

In Ortega’s view, for Wikipedia to remain sustainable, better ways to use Wikipedia in education need to be carved out. Furthermore, ways to improve the interface and the reviewing process are needed. Together these can improve the overall user experience. Ortega argued that Wikipedia needs better community building and maintenance tools, and that it needs to exploit the power of academia.

March 26-27, 2010. 2nd CPOV: Wikipedia Conference. Institute of Network Cultures, Amsterdam. [slides]

More information about him: