GLAM-WIKI - August 6-7, Canberra
GLAM-WIKI: Galleries, Libraries, Archives, Museums & Wikimedia: Finding the common ground August 6-7 2009, Canberra
See http://glam.wikimedia.org.au/ for more information. Attendance is FREE but register early… places are limited!

☍ Links for 2009-06-23
- Mako is keynoting LCA (January, Wellington). WOOT.
- There was a conference in Portland recently called Open Source Bridge and it looks like it was really freaking cool.
- Really cool looking event in Canberra this week, courtesy of Senator Lundy: Public Sphere 2 – Government 2.0. Includes the creation of a Government 2.0 Taskforce which will provide some advice and even some funding! From the event itself, a wiki-based outcomes document is yet to surface.
- The P2P Foundation blog is publishing some interviews with Wikipedians & ex-Wikipedians this week:
- Michel Bauwens and Axel Bruns (Bauwens is AFAIK the main person behind the P2P Foundation. Bruns is an author and also keeps a blog; he’s rather fond of the term produsage)
- ‘Cedric’ and Barry Kort (‘Cedric’ is from Wikipedia Review; Barry Kort is from MIT, and was involved in drama at en.wp as User:Moulton and has written a ‘Knol’ called The governance model of Wikipedia)
- Wikimania sneak peek! Since no announcement has been made, I’m not sure if we’re supposed to be able to see these yet or not…
- There’s an Open Education Conference being held in Canada during August 12-14. The speakers look pretty diverse, so if you’re interested in attending, check out how to apply for one of their travel scholarships.
Well it is conference season…

Wikisource at a law conference and other ☍ links for 2009-06-20
Via Open Access News: a chap called Tim Armstrong gave a presentation at a conference for law school computing, called Crowdsourcing and Open Access v2.0: Harnessing the Power of Peer Production to Disseminate Historical Records and Legal Scholarship:
This presentation expands the inquiry [of “[enlisting] anonymous collaborators online to help make legal research materials freely available”] to consider whether crowdsourcing tools can aid in the dissemination of historical records and, of particular interest to law faculty, legal scholarship.
[…] I will use two examples drawn from Wikisource, an open-access library of public domain (or freely licensed) works, to illuminate the power of “crowdsourced” efforts to archive and distribute historical and scholarly works. First, I will highlight the efforts of the Wikisource community to digitize, and make available in full text, the earliest volume of the United States Statutes at Large, a work not freely available anywhere else online. Second, by way of “walking the talk,” I will discuss my recent experiment in disseminating my own legal scholarship by the same means, yielding a product that seems superior in a number of respects to more familiar large-scale scholarly repositories such as SSRN.
Neat, eh? Slides are also available. And Tim also put up one of his own papers that he licensed under CC-BY-SA. It’s called “Fair Circumvention”, and you can check it out as a PDF, as a Wikisource document, or of course in a side-by-side comparison. Tim is also an admin on Wikisource.
Wikisource bills itself as an “online library of free content publications”, but that seems to me to be a vast understatement that doesn’t capture what’s special about it.
Wikisource, as far as I know (which is not very far, and I will happily accept corrections here), relies heavily on the file format DjVu (pronounced “deja vu”) and a MediaWiki extension called Proofread Page. “DjVu is a computer file format designed primarily to store scanned images, especially those containing text and line drawings. It features advanced technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal images. This allows for high quality, readable images to be stored in a minimum of space, so that they can be made available on the web.” (So reports this example: Alice in Wonderland.) So DjVu is kind of like a version of PDF that’s been uber-enhanced for scanned text.
English Wikisource seems to lack a help page that explains its basic operations in a single page. Especially with screenshots. Or did I miss it?
Peter Suber pointed out the similarity between this idea and Open Medicine’s idea of simultaneously publishing articles in HTML and “wiki” (previously mentioned on this blog), but I think that is slightly different, as I believe Open Medicine intended to encourage further collaboration on the work, whereas Wikisource transcribes PDFs, but with the intention of staying faithful to the original. If you want to keep editing it, perhaps it’s time to move it to Wikibooks/Wikisource?
I like the idea of using a wiki as a repository, whether or not you intend to allow further editing, but I’m just concerned that MediaWiki syntax is not standardised and you are just getting locked in to another platform. Template proliferation may be another problem.
And, elsewhere:
- File sharing has not discouraged creativity. This will be no surprise to many people, including Julie Cohen, who spoke memorably at the Copyright Future: Copyright Freedom conference about “copyright & creativity”.
- The Open Video conference is on at the moment in New York. Of course, don’t worry if you can’t make it, because there will definitely be tons of video. :) The schedule looks interesting.
- The Global Watchtower blog (‘Globalization in Practice’) has written on LinkedIn’s mishandled attempt to ‘crowdsource’ translations. Essentially they emailed every LinkedIn user who had a word like ‘language’ or ‘linguist’ or ‘translator’ in their profile, and asked them to fill out a survey saying if they’d like to do translation for LinkedIn for free. Unsurprisingly, that didn’t go down well. However it’s great to see Global Watchtower present a nuanced understanding of what they call “CT3” (“community, crowdsourced, and collaborative translation”). I highly recommend this blog for anyone interested in developments in commercial language technology (especially translation technology news).

Charles Matthews: What did we learn from "Matthew Hoffman"?
This post is by Charles Matthews. Charles was a member of the English Wikipedia ArbCom from 2006 to 2008. His first guest post was On Notability. —Brianna
Some ArbCom (Arbitration Committee) cases on the English Wikipedia can reach the mainstream media: there was a recent decision on Scientology-related editing which did just that. Others are very much for insiders, and the innocuously-named Matthew Hoffman case, the topic of a recent ArbCom statement, is an example. I brought the case, a year and a half ago. This will be part retrospect, and part a meditation on “ArbCom 2009”.
What did we learn, then? The short answer is “not enough”. ArbCom 2009 has come to the view that the case should never have been accepted. I don’t think I’ll hire them as historians: the decision they have recently issued about the case is much the same as saying that in 2009 the case would not have been taken, and if taken would have been handled very differently. I’m not quarrelling with that conclusion since it is probably simply true, and it is well within ArbCom’s remit to reconsider matters and the way they were dealt with in the past. What catches my eye there is that justice was always an issue in the Hoffman case, since User:Matthew Hoffman was permanently banned by two admins on no evidence at all. That is one point, and the new statement changes nothing about it. And the other is that Wikipedia is a dynamic place. ArbCom 2009 is not ArbCom 2007 which accepted the case – only a couple of those Arbitrators are still there – and the whole context changes, particularly since ArbCom is an elected body. Elections also matter in this story, since both admins in the frame ran in the 2007 elections that could have put them on ArbCom 2008, and the case was concurrent with the election period.
The Matthew Hoffman case was brought by me because I thought the ArbCom (of which I was a member 2006-8) should look at how it could happen that two admins at the Administrators’ Noticeboard (AN) could decide on the flimsiest of grounds that the Matthew Hoffman account was a sockpuppet (of some other unspecified account), never think to ask for a CheckUser run to verify this and see what other accounts were involved, and one of them (SH as I shall call him) block the account permanently, with a misleading log entry saying “vandalism-only”. Now, in the light of the Scientology decision, the rationale on the admins’ side can be clarified this way: the class of ‘single-purpose accounts’ (SPAs) brings itself under suspicion, because an SPA edits just in one area. When (as for much Scientology-related editing) there is reason to believe that the editing of a group of SPAs is centrally organized, then worries increase. This argument was brought up in the Hoffman case, with creationism in the place of Scientology. The ArbCom of the time took little notice of this line of reasoning (rightly, in my view). It is still no crime to be an SPA, though it will in practical terms tend to tell against an editor in dispute resolution. Note the distinction, though: Hoffman was blocked by admins not trying to resolve a dispute, because the AN discussion of his case took place while he was blocked for 72 hours. That’s the key problem here with natural justice. Hoffman was locked out of responding on the site to the sockpuppet claim by a short block. (ArbCom found that while the Hoffman account was an SPA, there was no evidence at all that it was a sock. Suspicion is not evidence, but it plays a part in how matters are handled administratively on the site, so that justice is not always served.)
Someone else, before I got there, had put it to SH that the block should be reconsidered, only to be told that “sorry, it was consensus at AN”. Here’s another thing we learned, namely two admins on a noticeboard (meaning an unregulated onsite process) can decide to block someone indefinitely, on no evidence, and then fend off outside interest. That was as of 2007, and I don’t suppose the same uncritical attitude would pass muster now. It took some months for the matter to get to court, and I’ll not rehearse the whole history. The fact is that SH’s block was his personal responsibility, and was so treated by ArbCom when it took the case, which brought forth little general illumination beyond the SPA argument I have mentioned. It was shoehorned into being a case about SH; I (naturally) was recused, and this was not the inquiry I had wanted, but it was all out of my control. For more on the facts see my only extensive onsite discussion; the matter is in the first two questions, but the joint statement in the blue box at the top of the page explains why I’m not going to cover this ground again, and indeed stopped short then.
I was outraged by the whole business: a culture of admins being unreasonable rather than responsive in this matter just created a fall guy. Let’s hope that has changed. How should it all work, in the big picture? My view: admins should be granted plenty of discretion in using their powers to defend Wikipedia’s content and mission. But admins who make poor discretionary decisions should expect to have to defend those decisions rationally when challenged; and failure to engage and make an acceptable case is a serious question mark over the admin. It’s not the mistake (we all make them), but the attitude to discussing the decisions that make up the admin workload. The admin community is in potential conflict with the small ArbCom (of about 1% of the size of the admin body) that can remove their powers. Some other Wikipedias do without an arbitration process, and so the justice mechanism is the admin body and its self-regulation; but self-regulation can be flawed, too. ArbCom can review ‘community bans’, namely bans upheld by all admins, but this kind of review now rarely causes trouble and it is unusual for a community ban appeal to succeed; this path isn’t really controversial.
The dispute that arose could certainly have been avoided by applying the maxim “thoughtful, not combative”. It was disastrous (all round) that a block discussed briefly at AN was confused with a community ban, with so much muddle. Was Hoffman a vandal, a sock, or a disruptive editor, and did anyone care which? None of the above: it was a bad block being covered up. Perfunctory discussion at AN must not be held up as deciding these matters once and for all. Why would it not have been important at least to know of what other account the Matthew Hoffman account was a sock? Why was he run off the site before being asked whether it was a real name? Those questions are pretty much rhetorical, but let’s not lose sight of natural justice. There has been strong advocacy, and much procedural argument, but let’s also hear it for the facts, evidence, and setting matters straight.
Hoffman hasn’t returned to Wikipedia. Moving on, what do we learn about ArbCom 2009? The ArbCom, as of 2009, seems to be binding itself to operate in a more tightly constrained way, by placing emphasis in its Hoffman statement on procedural rather than evidential matters. We are back to justice, but this is more like the apparatus of the television lawyer drama. In fact the ArbCom was changing as of 2008, accepting many fewer cases than before, and we are now at perhaps 25% of the caseload numerically compared to the peak period in 2006/7. These cases are generally more complex, and take several times as long to close.
The bigger picture is of admins plus ArbCom in tension on the English Wikipedia, as a shifting relationship that went through an uneasy period in 2008. We are certainly seeing some movement at the moment.

Tech conference CFP season
The 2009 OSDC CfP is closing on 30th June. So you’ve got about two weeks left to get your act together. This year the Open Source Developers’ Conference will be held in November in Brisbane.
Secondly the 2010 LCA call for miniconf proposals is now open. This year, there will be 12 and they will all be one day long (previously half-day or two-day proposals were also accepted). I think this is for the best — one day is pretty much the right amount of time to fill.
Last year I ran the Free as in Freedom miniconf which was successful in its own right. I am pondering whether or not to propose it again. At the moment I am leaning towards no, because it would be rather a lot of work, especially as I’m not particularly familiar with the New Zealand situation (linux.conf.au will be in Wellington). OTOH maybe that is a good opportunity to find out what’s going on in NZ. I’ve got about four weeks to give it some thought.
I am also on the programme committee for this year’s LCA which will be a new and exciting experience. :)

Lifting the copyfraud veil from the public domain
Cornell University Library Removes All Restrictions on Use of Public Domain Reproductions. Apparently “the immediate impetus for the new policy is Cornell’s donation of more than 70,000 digitized public domain books to the Internet Archive”. From the press release:
Institutional restrictions on the use of public domain work, sometimes labeled “copyfraud,” have been the subject of much scholarly criticism. The Cornell initiative goes further than many other recent attempts to open access to public domain material by removing restrictions on both commercial and non-commercial use.
I don’t have much else to say about this except that it’s awesome. I am interested to find out more about the donation to the Internet Archive. Was Cornell sold on the publicity or the free OCRing? Something else? How else can we volunteer the energy of the internets like this for good?
