Of bots and conlangs: the Volapük Wikipedia


“Vükiped”: logo of the Volapük Wikipedia

If you are after some good wikidrama reading as you settle in for 2008, it’s hard to go past the current Volapük Wikipedia. This tale is a potent combination of machine translation, bots, minor constructed languages, language advocacy and statistics. At heart it is a tussle over the answers to the questions, “What is Wikipedia?” and “Why do we create Wikipedias?”

I first became aware of the Volapük Wikipedia (vo.wp) in October when I was doing some planning for the Commons Picture of the Year competition, deciding which languages I should push as a priority. I looked at the meta page List of Wikipedias and found there were 15 Wikipedias with over 100,000 articles. That seemed like a neat cut-off point, and so I made my list.

Except the 15th one was “Volapük”, and I felt more than a little embarrassed that I had never heard of this language before, because I love languages and linguistics… Looking further along that table revealed vo.wp had only 5 admins and 250 users… Proportionally, that was a tenth or less of the size of the others in the top 15. What were they doing?

At that time, SmeiraBot had made over 3/4 of the total edits on the entire wiki. So the disproportionate growth was thanks to bots.

A month or so beforehand, someone had come to similar realisations and made a proposal to close vo.wp. I commented on that proposal in favour of deleting the vast majority of the bot-generated articles. In brief, Smeira’s actions offended my feeling of what Wikipedia was, because there would never be a community to maintain 100,000 articles in this language. Is Wikipedia just a free content encyclopedia, or is it a free content encyclopedia written and maintained by a community? That proposal ended up being closed as Keep. Despite all the heat and light, I doubt many of the commenters actually wanted the entire thing deleted.

Then on Christmas Day, Arnomane made a proposal for a “Radical cleanup of Volapük Wikipedia”. His proposal was not to close the project but just to delete the vast majority of the bot articles. That set off a lengthy thread on foundation-l called “A dangerous precedent”, which is still ongoing.

There are two red herrings that have been floating about in this debate. The first is that if people are opposed to this bot bomb, then they are opposed to all bot-generated articles. Of course not. Bots have a time and place. Seeding new wikis is certainly a very useful function of bots. But “seeding” evokes the idea that people will be around, a community, to tend to the articles after that. This was a seeding for a wiki bigger than the Romanian Wikipedia. Romanian has 28 million first- or second-language speakers. 28 million people to potentially tend to ro.wp’s 98 736 articles. Volapük has 20. Twenty. Total. vo.wp’s bot-generated content is hugely out of proportion to the reality of its speakers.
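To put rough numbers on that mismatch, here is a minimal Python sketch of the arithmetic. The speaker and article counts are the ones quoted above; the figure of 100,000 articles for vo.wp is an assumed round lower bound (the list only says “over 100,000”), and the function name is purely illustrative.

```python
# Back-of-the-envelope ratio of articles to potential maintainers.
# Speaker and article figures are the ones quoted in the post.
# ASSUMPTION: 100,000 is a round lower bound for vo.wp's article count.


def articles_per_speaker(articles: int, speakers: int) -> float:
    """Return how many articles each speaker would have to look after."""
    if speakers <= 0:
        raise ValueError("speakers must be positive")
    return articles / speakers


wikis = {
    "ro.wp": {"articles": 98_736, "speakers": 28_000_000},
    "vo.wp": {"articles": 100_000, "speakers": 20},
}

# Compute the ratio for each wiki so the two can be compared directly.
ratios = {
    name: articles_per_speaker(w["articles"], w["speakers"])
    for name, w in wikis.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.4f} articles per speaker")
```

On these figures every Volapük speaker would be responsible for about 5,000 articles, while Romanian has far more speakers than articles. It is only a crude proxy, since most speakers of any language never edit, but it shows the scale of the gap.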

Why do we create Wikipedias? This is where the “language ego” must come in. I don’t know the right term for it but I’m sure there is one… People want to create a Wikipedia, an encyclopedia, when they feel that their language is one worthy of communicating written knowledge. That is part of the reason why people get so hot under the collar when they get even a hint of a suggestion that someone has said a minority language does not deserve some X the same as other, larger languages. Linguistic rights belong to speakers of natural languages, I think, not constructed languages. If you want to disagree on that point, then OK, but they should definitely not just be swept together as “minority languages” of equal cultural and historical importance to the human race.

Is it OK for Wikipedia to be used as a conlang-promotional experiment if it is shaped like a free content encyclopedia, even one that is virtually doomed to permanent poor quality? That’s not a trick question…

31 December, 2007

Comment

1

Brianna, the questions to be asked about the Volapük Wikipedia are very simple:

1. Do we want to have conlangs?
2. Do we allow bots to add content? (In general; I am not asking about the rules on a particular project.)
3. Do their actions cause technical problems for the rest of the projects?

Yes-Yes-No answers give Smeira the right to do so.

If the answer to the first question is “no” (actually, my opinion here is a “weak yes”), it affects other conlangs, too.

A “no” to the second affects a lot of other projects.

If the third answer is “yes”, of course, it should be stopped, but the correspondence between a project’s usefulness and its resource consumption is almost 1:1 when we are talking about WM projects. The most resource-hungry project is the English Wikipedia, and it is the most useful one.

All other “yes, but” or “no, but” answers express “how dare you?” or “I want mine to be bigger than yours”, which are emotional and irrational positions.

Our rules have to be the same for everyone; we shouldn’t make rules and then break them because we like or don’t like something. This is the most important thing on the Wikimedian way to becoming a mature society.

(BTW, I remember to look at the comments when you write another post. Is there a way to subscribe?)

Milos Rancic · 31. December 2007, 19:54

2

Brianna,
In general, articles on Wikipedia are deleted according to the policies, which have been created in line with the Wikimedia mission and implement the specifics of the five pillars and the foundation issues.

Articles can be proposed for deletion if their topics are out of the scope of Wikipedia, if they contain libellous or illegal materials, or (in some Wikipedias) even if they contain inaccurate information…

Now I have a question.
Which particular criteria and policies are we using?

H.

H. · 1. January 2008, 01:02

3

What I see in the debate on the Volapük Wikipedia is a problem where people want to find some rules (or a “bright line”, as one person said) to apply in this case, so we don’t have to make an arbitrary decision on this specific Wikipedia — & they’re not finding it.

My specific concern in this matter is that this should not be used as a precedent for deleting bot-created articles, since I have “sorta” created a few hundred in that way myself. (I say “sorta” with the quotation marks, because a more precise explanation would be distracting & unhelpful.) And I feel that Smeira’s motivation in this matter is also irrelevant; people do good things for the wrong reasons all of the time.

The basis any decision should be made on is whether there is a viable community that wants to create an encyclopedia in Volapük. If there is, I have no problem with it existing with 100,000 articles; if there is not (or it consists simply of one devoted person & an uncertain stream of casual contributions), then something should be done.

Unfortunately, I believe the argument about the Volapük Wikipedia is being fought for all of the wrong reasons, & unless we move to better ones, any decision will only serve to hurt the Wikipedias as a whole, not help them.

Geoff

llywrch · 1. January 2008, 05:14

4

To Milos: Hmm, I think you are not responding to what I wrote and instead reframing the debate in your own terms, since I specifically said I didn’t support closing vo.wp and I pointed out that you can support bot generated articles without supporting everything they do (and likewise oppose).

“Our rules have to be the same for everyone; we shouldn’t make rules and then break them because we like or don’t like something. This is the most important thing on the Wikimedian way to becoming a mature society.”

So you don’t subscribe to IAR, then? :)

You can subscribe to the feed for all the blog comments here but I don’t currently know of a way to just follow comments on a single post. hmm… I will investigate.

pfctdayelise · 1. January 2008, 19:20

5

To H,

All Wikipedias have different notions of what is proper to include in an encyclopedia. ja.wp reportedly has extensive entries on bus stops. en.wp has extensive entries on Pokemon characters and pop culture topics. From what I understand neither of these situations would be accepted at de.wp. I don't think it is so radical that a project may delete content that is low quality.

In most cases this would not happen, because Wikipedians recognise that the quality can improve, and low quality can even be a good prompt to expand or improve material. But the reality of vo.wp is that this will not happen: even with the most optimistic projections of learners and participants, there is no conceivable way vo.wp could grow a community big enough to tend to 100,000 articles.

pfctdayelise · 1. January 2008, 19:26

6

Geoff,

You are right, people discuss this issue in terms of their own favourite hot-button issue, which is why I don’t see that consensus will be forthcoming any time soon.

pfctdayelise · 1. January 2008, 19:56

7

Brianna,

Let me clarify: My question is:

What criteria and policies should we use, in order to decide which articles in the Vukiped are to be deleted?

Best,
H.

(Sorry for twice-submitting; if only I could correct my posts as on wiki!)

H. · 3. January 2008, 01:39

8

Dear Brianna,

I agree with several of the points made by previous commentators above. I will only add the following:

(a) I think the bot-created stub question should be discussed at a higher level. Some controversial ideas I suggest: bot-created stubs, even thousands of them, even up to 95% of all articles, do not harm or prevent human work and cooperation; they simply add to it. Most of the opposition against Volapük bot-stubs seems to stem from the belief that it somehow belittles the hard work human editors put into fewer, but better, articles elsewhere. It doesn’t. There is no reason why it should, just like there is no reason to consider a Wikipedia that created twenty stubs as ‘better’ than one that created one Featured Article.

(b) There is a suggestion that ‘small communities shouldn’t do many articles’. This is based, I think, on the misunderstanding that number-of-articles measures quality. It doesn’t. Just like you can’t measure the area of a rectangle by the length of one of its sides. If this parameter is given up, what objection is there to stubs in any number? Wikipedia is not the Olympic Games, nor is it about who’s bigger. Work quality is not measured by such simple numbers.

I have given what is, I hope, a more detailed exposition of these points and others at Arnomane’s blog (link: http://arnomane.wordpress.com/2007/12/28/the-bot-equivalent-to-the-atom-bomb-was-ignited/ ).

All the best,

Smeira · 3. January 2008, 09:04

9

To H@7: Why not start with any that haven’t been edited by non-bots, and then the shortest ones? I think I missed your point.

pfctdayelise · 4. January 2008, 01:46

10

Brianna,
Look, I am looking for policies and criteria; what you gave me is an action whose justification is exactly what I am seeking.

So, is this action required by existing Wikimedia policies? Is it consistent with the Wikimedia mission? Or is it necessitated by the survival needs of the Wikipedia project?

Best,
H.

H. · 4. January 2008, 12:02

11

H, AFAIK there is no existing policy that contradicts Smeira’s actions.

The mission statement says, “The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content…”.

I think there’s a strong argument vo.wp is not currently doing that.

pfctdayelise · 5. January 2008, 15:32

12

Brianna, that is an interesting point of view.
Which people do you think the Wikipedia in Volapük should seek to engage first?

Best,
H.

H. · 6. January 2008, 08:07
