Angela Beesley

Richard Rogers: Okay, welcome to the structures of meaning session. My name is Richard Rogers; I am from Media Studies at the University of Amsterdam, and the Foundation here in Amsterdam. I am joined by three experts, or actually four experts, from what you might call the space between the front-end and the back-end.

We have Angela Beesley from the Wikimedia Foundation, famed facilitators of Wikipedia, the collaboratively authored wiki-based encyclopedia and soon to be, I would imagine, the slayers of the Britannica dragon. We have Steven Pemberton, who’s on a couple of W3C working groups, on HTML and XForms. He works at the Dutch National Research Institute for mathematics and computer science, also here in Amsterdam. We have Anne Pascual and Marcus Hauer from Schoenerwissen, the office for computational design based in California. Anne and Marcus also work in media arts and technology at the University of California at Santa Barbara.

Now to kick things off, I wanted to try to highlight a couple of overarching themes behind the structures of meaning on the web, at least as far as I see them, over the last decade, and then I’m going to turn it over to the speakers. After each speaker finishes I propose we take only very, very urgent questions at the time. After all three have had their turn we can open it up. Now there is one content note here, meaning note: the transcripts of all the talks, the ones that actually were not made in full, will be available on the web site.

Okay: structures of meaning, I just wanted to throw up four themes. Theme one: quality of information. Recently, when some US congressmen voted to debate the certification of Bush’s election victory, some of the Representatives in the House argued that the challengers of the US election were getting their information from blogs. Now blogs became the latest way to fill in the old chestnut, the old idea of the web as a rumor mill. Now suddenly we can trash the web again, but is it so easy to do so, and should we ignore those who question the quality of information?

Secondly, theme two, relatedly: reputation. Gaining it, keeping it, managing it. Remember when we had all these little award icons, and listings of top five cool sites and things? I dragged out my collection from 1999, inspired by one of the previous speakers; this is my dead awards collection. All of these awards are completely dead and buried. Later, as we moved from hit counts to link counts, so from hit economy to link economy, as an indicator of reputation, another shift occurred: we were no longer in charge of our own site descriptions. Metatags and things mattered less. Others were determining how we would be described.

Theme three: information recommendation, which again is related. With such recent initiatives as delicious, books we like, and certain social software, we are creating information recommendation devices authored more by tribes and communities than by much larger, greater conversations. So the web has become, at least in information recommendation terms, increasingly tribal.

Theme four, the last one: semantic web. Now there always has been this massive tension between allowing the web to organize information in itself and creating classification systems that we should squeeze our information into. The question I would like to pose to the speakers as a running theme is: are we tagging ourselves to death? I would like to turn it over now to Angela Beesley from Wikipedia. Angela.

Angela Beesley: Hi, I’m Angela Beesley, I’m from the Wikimedia Foundation and I will tell you a little bit more about what that is in a moment here. I’m going to be talking about Wikipedia and how that’s been designed through open source principles.

To start it off, what are wikis? The first wiki was started in 1995. This was a site where programmers come together to basically discuss programming topics. A wiki is simply an openly editable web site that can cover any range of topics, where anyone can come along and create a page. You don’t need to know any programming languages to do that. Wikis have a very low learning curve so they are really open to anyone.

One of the reasons why it is very easy to do is because wikis have a simplified syntax. I am going to show you an example of that. This is actually an edit screen. This is where you edit the wiki, so it has an editing toolbar on the top as well, so you don’t need to know bits of syntax, you can just click on icons. Things like the double square brackets will create a link to another page in the wiki. So simply by typing things any user can come along and create web pages straight away.
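The double-square-bracket convention mentioned above can be illustrated with a small Python sketch. This is not MediaWiki's actual parser; the function name `render_links`, the `/wiki/` URL prefix and the handling of the `[[target|label]]` form are assumptions made for illustration only.

```python
import re
from html import escape
from urllib.parse import quote

# Matches [[Target]] and [[Target|label]]; a simplified pattern, not the real grammar.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def render_links(wikitext: str, base: str = "/wiki/") -> str:
    """Replace [[...]] markers with HTML links (hypothetical helper).

    Only the link label is HTML-escaped here; the surrounding text is
    passed through unchanged to keep the sketch short.
    """
    def replace(match: re.Match) -> str:
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()
        # Assumed URL scheme: spaces become underscores, the rest is percent-encoded.
        href = base + quote(target.replace(" ", "_"))
        return f'<a href="{href}">{escape(label)}</a>'

    return WIKILINK.sub(replace, wikitext)


print(render_links("See [[Amsterdam]] and [[Dutch language|Dutch]]."))
```

The point of the sketch is that the author types only the page name; the software works out the address and the markup.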

A more common feature of wikis is that all user actions are logged and are very easily reversible. So if someone makes an edit, you can easily revert it if it was a bad edit.

There are three terms here, which are very commonly confused. There is Wikipedia, Wikimedia and MediaWiki. So I’m going to talk a little bit about those and tell you the differences between them. Very quickly, Wikipedia is essentially an encyclopedia, Wikimedia is the foundation that runs it, and MediaWiki is the software that it runs on.

A little bit more about the Wikimedia Foundation. This is a non-profit organization. It was founded towards the end of 2003, basically to manage Wikipedia and all its sister projects, which include Wiktionary, a free content dictionary. Wiktionary was created because people started to have dictionary definitions in Wikipedia, and other people would say that Wikipedia is supposed to be strictly encyclopedic, so that spun off into its own project, Wiktionary. One of our larger projects is Wikibooks, which is a collection of open source textbooks for educational use.

A very recent project is Wikinews. It is still in beta testing at the moment. It is really just a project to see whether we can apply the principles of Wikipedia to a news site, so actually people coming along and writing news stories. Wikisource is a collection of primary source documents, so it is not something that people would edit so much. They can upload texts and then use them as a source in any of the other projects.

Wikiquote is a collection of different quotations. The Wikimedia Commons is a repository of images and other media files that are used by all the other Wikimedia projects, and can be used externally as well because all of those are under a free license.

Wikipedia itself is, of course, an encyclopedia and was started four years ago. We just had our fourth birthday. In the first eight months we had eight thousand articles, which was quite a big thing, because before Wikipedia there was a project called Nupedia, which managed to create about thirty articles in two years. Nupedia had much stricter controls over who was allowed to edit and a very controlled review process, but it turned out to be very slow. So they opened this up a little bit more and said let’s use a wiki to write Nupedia articles, and this took over very quickly and Nupedia soon died out because it just wasn’t producing articles at any sensible rate. So that was the birth of Wikipedia.

It is a very international project. It started off in English but within a couple of months there was German and then French. One of the core policies of Wikipedia is that the encyclopedia is written from a neutral point of view. We try to avoid any bias and if there are two opposing points of view we describe both of those in the article rather than taking any particular side.

All the content on Wikipedia is freely licensed. It is available under the GNU Free Documentation License, so it is re-usable. People can take it, modify it, use it commercially or non-commercially.

This is an example of one of the internationalization features we have. The interface and so on is completely internationalized. So this is a screen shot of the Hebrew Wikipedia. As you can see, it is using right-to-left writing, and the logo, which would normally be on the left side, is on the other side. We’ve got about 100 languages, but only twenty-one of those so far have over ten thousand articles. The others are just really beginning.

Some statistics: we’ve got 1.35 million articles in total, and this is across all the different languages. The English Wikipedia alone has got 450 thousand, which actually makes it larger than Britannica and Encarta combined. The Dutch Wikipedia is just about to come up to its fifty thousand articles milestone. According to alexa.com ratings, we are now in the top 100 web sites. So it’s a very big project. Over four years it’s developed well over a million articles.

This is a screen shot showing some of the larger Wikipedias. The dark blue line is the Dutch Wikipedia and the red one is the English. They are both growing at a similar rate but are different sizes at the moment.

There are a lot of social rules within Wikipedia, about how people can edit and so on. There are also some technical constraints that affect what people do on the site. For example, we can protect pages and block users. The technical parts of the software affect how people are using the site and what they can do with it.

There is also a lot of openness. The software doesn’t constrain what people can actually add. People can format pages in pretty much any way they like. There aren’t that many technical constraints. It is very much up to the users and the community to determine what can go on the site. Policies just emerge from consensus-based processes. People will suggest a policy and then, if there is agreement on it, it will just become policy. There is no top down process run by just one person. It is very much community driven.

Contributors construct their own rule space. One of the policy pages on Wikipedia says “ignore all rules”. So there is quite a lot of openness to what contributors can do. But there are certain norms, which are either enforced by the community or in some cases by community leaders. Some people are more trusted to enforce policies than other people would be.

One question that is always asked about Wikipedia is, “can the content be trusted?” I think it can. We have a lot of review processes. People say it is not peer reviewed, but in many ways it is constantly being reviewed. We’ve got things like recent changes, which shows every single change that is made to the content, and people are constantly checking that. We also have the watch list feature, which allows you to watch particular articles in your area of expertise. So the articles are constantly being reviewed.

One thing we are rather keen on doing is developing a stable version. So users worried about coming to an article and not knowing whether the last entry could be trusted would have the choice to view a stable version if they wanted to.

Moving on to the software that we use. This is called MediaWiki. It is used by Wikipedia and our other projects and also by a lot of external sites as well. It is one of over a hundred wiki engines; a wiki engine is just a type of collaborative software that allows people to edit in this way. MediaWiki itself was primarily developed for Wikipedia. Before we were using MediaWiki we were using UseMod, but it turned out to be not scalable enough for such a large site as Wikipedia. So the wiki engine was re-written from scratch in PHP and it uses a MySQL database backend. Over time, it became more scalable than most other wiki engines. It is also used by a lot of other web sites, sites like Memory Alpha, a Star Trek wiki. There are all sorts of different content sites, not just Wikipedia, that are using MediaWiki.

Functionality includes a lot of quality control features. Versioning: we store every version of a page that was ever written. So every edit that has been made is saved, so you can go back and check who’s added what and when things were added.
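As a rough illustration of storing every version and reverting a bad edit, here is a minimal in-memory Python sketch. The class and method names (`Page`, `Revision`, `revert_to`) are hypothetical and say nothing about MediaWiki's actual database schema; the one design point it assumes is that a revert is stored as a new revision, so nothing is ever deleted from the history.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    author: str
    text: str
    comment: str = ""


@dataclass
class Page:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def edit(self, author: str, text: str, comment: str = "") -> None:
        # Every edit is appended; earlier versions are never overwritten.
        self.revisions.append(Revision(author, text, comment))

    @property
    def current(self) -> str:
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, index: int, author: str) -> None:
        # A revert is itself a new revision that copies an older text,
        # so the bad edit stays visible in the history.
        old = self.revisions[index]
        self.edit(author, old.text, f"Reverted to revision {index}")


page = Page("Amsterdam")
page.edit("alice", "Amsterdam is the capital of the Netherlands.")
page.edit("vandal", "asdfgh")
page.revert_to(0, "bob")
print(page.current)
print([(r.author, r.comment) for r in page.revisions])
```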

The watch list allows people to keep an eye on certain articles and make sure they are not being vandalized.

Another feature is the organization of name-spaces and categories. Name-spaces keep a discussion of content separate from the content itself. So if there is a dispute on an article, that would happen on a talk page rather than the article itself. The categories are very user driven; the software doesn’t impose any particular category structure. Any user can come along and create a new category, and that hierarchy has been built up over the last six months, since the feature was introduced. So it provides a rather different approach to categorization: rather than being handed out in advance, it is very much being built up over time.
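The separation between an article and its discussion, and the idea of categories that exist only because a user created them, can be sketched in Python as follows. The namespace list is a small illustrative subset, and the helper names (`split_title`, `talk_title`, `add_to_category`) are hypothetical rather than MediaWiki functions.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Illustrative subset of namespace prefixes; a real wiki defines more.
KNOWN_NAMESPACES = {"Talk", "User", "Category"}


def split_title(full_title: str) -> Tuple[str, str]:
    """Split 'Talk:Amsterdam' into ('Talk', 'Amsterdam').

    A colon only counts when the prefix is a known namespace, so an
    article title that happens to contain a colon stays in the main
    (empty) namespace.
    """
    prefix, sep, rest = full_title.partition(":")
    if sep and prefix in KNOWN_NAMESPACES:
        return prefix, rest
    return "", full_title


def talk_title(full_title: str) -> str:
    """Return the title of the discussion page that belongs to a page."""
    namespace, name = split_title(full_title)
    if namespace == "":
        return f"Talk:{name}"
    if namespace == "Talk":
        return full_title
    return f"{namespace} talk:{name}"


# Categories are not predefined: one comes into existence the first
# time a user files a page under it.
categories: Dict[str, Set[str]] = defaultdict(set)


def add_to_category(page: str, category: str) -> None:
    categories[category].add(page)


add_to_category("Amsterdam", "Cities in the Netherlands")
print(talk_title("Amsterdam"), dict(categories))
```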

There is also administrative functionality. Pages can be protected. Usually this is done if a particular page is being vandalized a lot. It will be temporarily protected until that is sorted out. If there is a dispute over an article, it will be protected to force people to discuss it rather than edit war over it.

Blocking of users or IP addresses is a function of the software as well. So if a particular person is being a problem they can just be blocked from editing completely.
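Page protection and blocking amount to a permission check made before an edit is accepted. The Python sketch below shows one way such a check could look. The `Wiki` class and its fields are hypothetical, and the rule that administrators may still edit a protected page is an assumption of this sketch rather than something stated in the talk.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Wiki:
    protected_pages: Set[str] = field(default_factory=set)
    blocked: Set[str] = field(default_factory=set)  # user names or IP addresses
    admins: Set[str] = field(default_factory=set)

    def can_edit(self, user: str, page: str) -> bool:
        # A blocked user or IP address cannot edit anything.
        if user in self.blocked:
            return False
        # Assumption: a protected page stays editable by administrators only.
        if page in self.protected_pages:
            return user in self.admins
        # Otherwise the page is open to anyone.
        return True


wiki = Wiki(admins={"angela"})
wiki.protected_pages.add("Main Page")
wiki.blocked.add("192.0.2.7")

print(wiki.can_edit("newcomer", "Amsterdam"))
print(wiki.can_edit("newcomer", "Main Page"))
print(wiki.can_edit("192.0.2.7", "Amsterdam"))
```

Protection is meant to be temporary, so removing the title from `protected_pages` restores open editing.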

There are a lot of extensions, which makes it quite different from some other wiki engines, allowing math tags for mathematical formulas and so on. There are a lot of different types of formatting that people can add very easily through these extensions. Even hieroglyphics can be added to the wiki. The wiki interface itself is available in over 30 languages.

This screen shot shows a version comparison. This is page history, so every time someone makes an edit you can go in and see exactly what’s changed, with the old version on the left and the new one on the right. And the words that have been added have been highlighted in red.
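A word-level comparison of two versions, of the kind shown in the screen shot, can be approximated with Python's standard `difflib` module. This is only a sketch of the idea, not the comparison algorithm MediaWiki itself uses, and `added_words` is a hypothetical helper name.

```python
import difflib
from typing import List


def added_words(old: str, new: str) -> List[str]:
    """Return the words present in the new version but not matched in the old."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    added: List[str] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # 'insert' and 'replace' both introduce new words on the right-hand side.
        if tag in ("insert", "replace"):
            added.extend(new_words[j1:j2])
    return added


old_version = "Amsterdam is a city"
new_version = "Amsterdam is a large city in the Netherlands"
print(added_words(old_version, new_version))
```

A real comparison view would also show removed words and keep the surrounding context; here only the additions are collected, since those are what the highlighted words in the screen shot correspond to.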

(Technical difficulty)

Richard Rogers: Hi, whilst we’re waiting, I am just wondering if there might be any questions from the audience?

Steven Pemberton: Yes, Steven Pemberton. So, the web was actually originally designed for exactly this sort of thing. I mean Tim Berners-Lee was doing exactly this sort of thing with his first browser. It is because almost all servers don’t support the “PUT” part of the HTTP protocol that you have to make all of this extra software. Don’t you think it is a shame that you have to do that, when it is actually already built into the technology and shouldn’t you be pushing for servers who actually did it right rather than this approach?

Angela Beesley: I don’t know, there is a lot more to the software than people simply uploading their text to the web. There is so much functionality I don’t think you can do that without having specialized software designed for this purpose.

Steven Pemberton: But it would allow you to do, for instance, WYSIWYG editing, and then you just do save, and just like it ought to have done it just goes back to the web server. I mean that… for instance, the W3C site is exactly like that, it’s a wiki. Everyone on the team is allowed to edit any page, but we just use WYSIWYG software and we just hit save and it sends it back to the web.

Angela Beesley: I mean at the moment what we’ve got in terms of WYSIWYG is simple writing extensions that will allow that. So yeah, it’s a shame that it just can’t do that without having the extension of the software.
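The HTTP "PUT" method raised in this exchange lets a client send a complete new version of a document to a URL. A minimal Python sketch of what such a save could look like is given below, using only the standard library. The URL is a placeholder, `build_put_request` is a hypothetical helper, and the request is only constructed, not sent, since it would succeed only against a server configured to accept PUT.

```python
from urllib.request import Request


def build_put_request(url: str, html: str) -> Request:
    """Build (but do not send) a PUT request carrying a whole page."""
    body = html.encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "text/html; charset=utf-8"},
    )


# Placeholder URL, for illustration only.
request = build_put_request(
    "http://example.org/page.html",
    "<p>Edited in a WYSIWYG editor, then saved back.</p>",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the save. What this leaves out is the point made in the answer above: history, reverting, talk pages and the other functionality would still need software on the server.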

David Garcia: Maybe I’m pre-empting something you will come to later in your talk. There have been a few articles recently about a split in the Wikipedia community, with some people wanting to re-instate some of the more hierarchical protocols for quality control. Are you going to comment on that in the rest of your talk, or if not, could you comment on it now?

Angela Beesley: Yeah, there is a little bit of a split, but really the foundation is trying to address both sides of that, to see if we could come to some sort of compromise. This is what the part about “stable versions” was about. It will give people the chance to mark an article as being authoritative, and you would have these hierarchies saying yes, this article is now authoritative, and so on. But the live site itself will still be editable by anyone. So it is all mixed, the best of both worlds. Then people have a choice: if they need a stable version they can go to that, but they can still edit the live one, too.

David Garcia: So the things like the Wired Online article kind of hinting at a big division, you feel that’s been resolved by the process you’ve indicated?

Angela Beesley: I think it will be resolved; it is not actually resolved yet. And we are still discussing how to resolve it.

Geert Lovink: Could you give a bit of context for those of us who do not know what caused this split or debate?

Angela Beesley: I think part of it is because as Wikipedia got bigger, people began to rely on it more and more as a source. It is no longer just a project to create an encyclopedia; it really is an encyclopedia that people are using. So the need for things to be trustworthy really increased just recently since it’s hit the media so much. I think that is a part of what it is about, people just getting really worried about if they can trust it.

Geert Lovink: (?)

Angela Beesley: I don’t think there is one particular instance, no.

(Technical difficulty end)

Angela Beesley: Okay, looking at some of these features again, quality control, I mentioned recent changes already. There’s also the page history, which I showed you a screen shot of, so anyone can check exactly what’s been changed. If we’ve got a problem user, if we find a particular user adding bias or writing something that is not factual, every user has a list of all of their contributions. So if you find a problem you can go back through their contribution list and check their edits and make sure those are all right.

Besides the content on Wikipedia and its sister projects, there is also a large community behind that, which is something that people just coming to the web site and reading it often don’t notice. It is a very strong community. In any one month there can be as many as twelve thousand different editors. So there are a lot of community features to try and keep those people together. That includes talk pages; every article has a talk page attached to it. So if anyone wants to discuss a problem with the article they can go there.

Every user can put up a profile page about themselves, as well. So if people are wondering whether they can trust a person, they can always go to the profile and see what sort of person is editing it.

There are different access levels as well. People can apply to become administrators, which gives them a few more technical features they have access to, such as page protection or blocking of users. There are also automatic user-to-user emails to try and encourage people to communicate more. And there is automatic message notification. You can leave a message on the wiki for someone and they will be notified of that next time they come online.

As I said, all of Wikipedia is free content. All Wikimedia content is under the GNU Free Documentation License and the software is under the GNU General Public License. This means it can be freely distributed and modified. Another thing the license means is that authors have to be attributed by anyone using the content, which encourages people to contribute because they know that if their work is used, they will be attributed. The way Wikipedia does this is through the page history. If you click history on the top of any article you can see a list of who has contributed to it. The license means it also remains non-proprietary. Though people can use it commercially they can’t lock it down; they would have to keep it under the same licenses. So any re-use or modification will still be under “GFDL”, so Wikipedia can then re-use any improvements that people make to it. One advantage of this is that it increases a sense of shared ownership, because no one person owns an article. Anyone can come and edit and modify it at any time. So it really belongs to the project rather than any particular user. This decreases some problems we might otherwise have if some person wrote an article and they wanted it to stay in a particular way.

The web design of the site is split between developers and users. Developers create the skin framework and then it is very much up to the user to customize that and tailor it to particular projects. So for example, the English Wikipedia might not have the exact same design as the Dutch Wikipedia. Developers will provide default skins and they also create extension tools that help the users with editing the site. The users can format the site content in any way they want. They have access to certain HTML tags such as tables, so they can format things in that way if they want to. Different projects would have different formatting. So Wikipedia articles will not all be formatted in exactly the same way as textbooks on Wikibooks, for example; it is very much up to the users. And users can create user style sheets and edit the interface. They can also create new skins, which happens quite rarely.

Right here is an example of a user created skin. Someone has completely changed the design. It doesn’t look like the typical Wikipedia design. This feature is often used by other sites using the MediaWiki software. They don’t want their site to be confused with Wikipedia, so they will make a completely different design like this.

Users can create individual style sheets, so you can have Wikipedia looking a particular way for you. There is a problem with this in that users can’t always be trusted to design things very well. This is a screen shot of my skin of Wikipedia. This is how I see it. The one on the left is how it looks before you scroll, and the one on the right is after you scroll the page, which is just a complete jumble of links, rather unreadable. So one of the downsides of letting users do this is that they don’t really know what they are doing. They end up with a screen looking like that.

Here are some screen shots of the largest Wikipedias: English, German, and Japanese. It is completely up to the users how they decide to format their pages. This is the front page you see if you go to any particular language version. But what often happens is that they share a lot of similarities between them. So one site would make their main page and then the good parts of that will filter out into the other languages. You really end up with parts of the design that work well. An example of this is parts like the “featured article”, which is showing on all six of the largest Wikipedias. You can see that they have all copied the same sort of table format with the different colors and different areas. There are also a lot of similarities which are forced on them, in terms of the navigation. The logo is always in the same place; the standard links are all there on the left. So certain parts the user can’t change are given by the developers. But the rest of the page is completely up to the users to come along and edit.

There are a lot of external tools that people create. Because the software is free content, it is very easy for people to modify that and add on extensions. There are a few offline editions that people have been working on. One of them is WikiWriter. There is a screen shot of this here, which gives a more WYSIWYG approach to editing, for people who prefer that to editing directly on the site. This one actually splits off metadata; you can see the inter-language links and the categories are appearing in different columns. There are also various plugins for different text editors and a plugin for Firefox. If you use the Firefox browser, you can download this plugin and edit Wikipedia directly just from the right click button.

And users create markup and conversion tools, which convert HTML to wikitext as well. This TomeRaider screen shot is actually an extension, which allows Wikipedia to be read on PDAs.

I’m just going to show you some screen shots of the last four years of Wikipedia, which show a history of the site from being very content centric to becoming more user centric. And there is a tendency over the last four years to move to more dynamic information, like things that are in use and things that are updated daily rather than very rarely.

More visual elements, more color have been added over the last four years. And there has also been a separation between editors and readers; before there was just the main page for everybody, now there’s a main page for readers, and there is a separate community main page.

This is August 2001, just eight months after we started, when we were still using the UseMod software. So it is very basic, just text; there were no images at this time at all. It is just basically a list of main topic areas we had at the time. Then, in November 2002, we moved to what we called phase three. This is MediaWiki before it had a name. Again it is very text centric, but you have more navigational features now. You’ve got links to older versions and so on. Jumping to 2003, it hasn’t changed all that much but we’ve moved into a table format now. We split off the articles along with the community aspect under the “writing articles” heading, and you’ve got links to the community and you have policy and help pages and that sort of thing, now on our main page. In February of 2004 we had a logo competition, so we got the new Wikipedia logo. And it is also the first time we had color on the main page. It still isn’t very colorful but it is a little bit different than it was before. And then in 2005, this was the main page a couple of days ago. So it is still fairly similar. We have the same table but we have far more dynamic information, like the featured article, which is changed every day. Rather than just listing all the main categories, readers have to click the browse link and they can go to those categories, but the main page is used for things that are regularly updated. So the main page now is updated on a daily basis rather than more rarely.

Quickly to come to a conclusion: by empowering users like we’ve done on Wikipedia, they can fix their own design problems. Over the last four years, Wikipedia has opened up its design process so it is no longer just up to the developer of the site; through the use of style sheets and so on, it has become much more open. There are feedback mechanisms, the way the community feeds back to the developers, such as various pages on the wiki for people who have a problem, and a Bugzilla bug tracking system. People don’t need the knowledge to fix the bugs themselves because through these feedback mechanisms, they can report them to other people who can fix them.

Even beyond wikis, this open feedback management can go a long way. It can be applied not only to wikis; in terms of open design, other sites could take this up and give it to the user to help with the design aspects.