In the press release, their CIO says:
Our agency is custodian of a vast range of valuable geological and spatial datasets that are used by the public sector and private sector industries in the exploitation of resources, management of the environment, safety of critical infrastructure and the resultant well-being of all Australians. The Creative Commons licence has created a more efficient process for them to access this valuable information.
Looking around their website, though, it seems you still need to specially order or buy various bits of their data. I wonder if that will change as they update their website.
I’m not really up on “map stuff” but I am sure the attendees of the recent FOSS4G conference (Free and Open Source Software for Geospatial) in Sydney will be pleased about this.
Two nights ago I went to the first Freebase user meeting outside the US. (You can tell I’m setting myself up for an “I was there when…”)
So, what is Freebase? It claims to be a “database of everything”. There are several points of comparison with Wikipedia. Where Wikipedia is an “encyclopedia”, Freebase wants to be “everything”. It is far more structured than Wikipedia (which anyone who’s ever wrangled with an esoteric template might appreciate). Like Wikipedia, it’s a free content project: data derived from Wikipedia is GFDL (natch) and everything else is CC-BY. They have a very excellent and well-documented API — they’re not afraid to share. Bring on the mash-ups!
There are several more differences worth discussing. Currently, Freebase is alpha and invitation-only for write permission (i.e. an account). No worries, give it time.
More importantly, the back-end. Freebase is built on Metaweb’s closed-source back-end that is going to remain that way. Apparently they intend to release some kind of regular data dump, and even allegedly would have no problem with someone taking that entire data set and throwing it into MySQL or what-have-you and setting up a total project fork.
If it were free software, there would be a right to fork. But this is only free content. Is there any kind of corresponding “right to fork” for a free content community? Should there be?
If not, maybe this joke from Evan about “crowdsourcing” is just a truth:
The other reason I would wait until I had an entire data dump downloaded on my own disk before really barracking for Freebase is that I read their TOS:
5. API USE
We provide access to portions of the Site and Service through an API; for purposes of this Terms of Service, such access constitutes use of the Site and Service. You agree only to use the API as outlined in documentation provided by us on the Site. You may not use the API or any other features of the Site or Service to duplicate or copy the Site or Service.
Bummer. Although — here’s a thought — I wonder if that conflicts with the CC-BY?
(clause 8.e from CC-BY-3.0)
This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Licensor shall not be bound by any additional provisions that may appear in any communication from You. This License may not be modified without the mutual written agreement of the Licensor and You.
It’s not quite viral freedom, but almost as good. It seems to me this nice clause would render their TOS impotent.
So, interesting to see what will happen there. It’s Wiki[p|m]edia that convinced me (and taught me) about the absolutely vital right to fork. That is an incredible freedom which is vastly underappreciated by the journalists who are generally impressed with Wikipedia’s “freeness” (meaning no ads, or free access). And as a project leader, any kind of project, that is what keeps you on your toes. Maybe it is a good benchmark for deciding if you want to be a contributor to a particular project. If management gets too heavy, you can keep them in line by threatening to exercise your right to fork. Yeah!
Back to Freebase… another related, interesting aspect will be watching the development of their community and how it will be managed. Where Wikipedia was pretty grass-roots, it seems like Freebase is top-heavy, for the moment at least. Letting go, giving up control and trusting the unwashed masses is a very difficult psychological moment for anyone (who’s not a Wikimedian). Trying to get those same unwashed masses to behave themselves is a whole other kettle of fish. When I first contemplated this for Freebase two nights ago I was filled with cynicism, until I remembered… The thing about Wikipedia is that it only works in practice. In theory, it can never work.
I should make that my mantra. Every time I get cynical about something, think about that idea again. It only works in practice.
Virgin Mobile Australia has been hit with a lawsuit for its use of a photograph from Flickr in an ad campaign. The girl in the photo is underage and her-friend-the-photographer naturally didn’t get any kind of model release before licensing the photo CC-BY on Flickr.
Lawrence Lessig has a copy of the lawsuit on his blog which explains why Creative Commons has been named as a party in the lawsuit. It basically amounts to “they didn’t explicitly warn me something like this could happen”.
My thoughts are that I’m glad Virgin is being sued over this. They were jerks to use this photo in the first place. I understand that stupid multinational corporations can use works I license under CC licenses, but I’m happy they’re being pulled into line. I think CC being named in the suit is just misguided, but maybe it won’t hurt for the licenses to be tested in court. :) Is a URL without a username sufficient attribution?
Second thought. This confirms my belief that conscientious photographers should avoid CC licensing photographs of people. I would never CC license a photo of my friends. Famous people are fair game.
Third thought. I hope this inspires CC users to read up on what they’re actually agreeing to. For instance, I discovered something interesting: the version 1.0 licenses have this clause:
By offering the Work for public release under this License, Licensor represents and warrants that, to the best of Licensor’s knowledge after reasonable inquiry:
1. Licensor has secured all rights in the Work necessary to grant the license rights hereunder and to permit the lawful exercise of the rights granted hereunder without You having any obligation to pay any royalties, compulsory license fees, residuals or any other payments;
2. The Work does not infringe the copyright, trademark, publicity rights, common law rights or any other right of any third party or constitute defamation, invasion of privacy or other tortious injury to any third party.
Hm, well that makes all my CC-BY-SA-1.0 releases invalid, because I sure as hell never checked those things. And I sure as hell don’t intend to. Happily, CC seems to agree that those things don’t in fact belong in copyright licenses.
On the cc-community mailing list, there has been a killer thread about what “NC” (non-commercial, as in “this photo can be used for non-commercial purposes”) means (entitled “What does NC means?”). Many people are confused about this, and CC doesn’t seem in any rush to clear up the confusion. They seem happy with the poorly defined but vaguely comforting terms. Terry Hancock writes eloquently here about how NC and ND licenses betray the tradition that the “commons” part of the Creative Commons name lays claim to.
There seem to be plenty of people within CC culture who are pissed about this, but CC doesn’t seem willing to act to even encourage people towards freer license terms. They emphasise the clarity of “choice” to the individual licensor at the expense of benefit to the commons they purport to help create. It is kinda annoying.
I am starting to think we need a http://www.NCandNDarenotfree.org/ with arguments and polite form letters that people can send to probably-misguided NC and ND license users. Especially people who set site-wide licenses, like wiki administrators: these people need a clip around the ear if they choose an NC or ND license. Well, first they need a persuasive argument, then if they persist, the clip. It could be like GNU’s campaign to end Word attachments: they appear to have lost the war, but small individual battles are won each day.
And the last mention must go to the recent iCommons iHeritage event, celebrating South African Heritage day. They were uploading media to Wikimedia Commons and Flickr. There is probably still a bit to go as they were recording audio as well. I helped out a bit by creating some help files on Wikimedia Commons.
I’m sure there is much more content on Flickr. I can’t really blame anyone who chose to upload there instead of Commons. I suppose the good thing is that our Flickr transfer service makes copying them over nice and easy. :)
Today I attended PacLing2007, the 10th Conference of the Pacific Association for Computational Linguistics. I attended sessions on Named Entities, Lexical Semantics, Machine Translation and Terminology. There was also an invited talk by Ann Copestake on applying robust semantics. She had a neat example of how underspecification works, in solving Sudoku, and how you can make inferences from something underspecified. Well, it’s easy with Sudoku; I wonder how easy it is with language. :)
There were two main points of interest for me. The first is that Francis Bond, the Program Chair, asked all the presenters to license their papers under the Creative Commons Attribution 3.0 license, and they did. All of the papers from the conference program are available under this liberal license. (The webpage doesn’t say so, but each paper’s PDF has this as a footnote on the first page.) I think this is a fantastic, forward-thinking and commendable move on the part of PacLing. It acknowledges that all human knowledge builds on what came before.
The second thing that was interesting was the session Bridging the Gap: Thai – Thai Sign Language Machine Translation, although in the end it was perhaps not a terribly exciting MT system. I was curious about how TSL was represented. Apparently they have a big dictionary of Thai word <-> photograph of someone making the equivalent TSL sign(s). Given that movement is a meaningful part of sign language, I wonder how well this works. I am not sure now if the presenter told me that they slice up a video of the movement into frames to represent it, or if I imagined that. :)
I spoke to the presenter (I think it was Srisavakon Dangsaart) afterwards about signwriting, which she had heard of. She seemed to indicate it wasn’t used for TSL. I asked if it couldn’t be useful for TSL ‘speakers’ to be able to write using it. Her MT system is definitely useful and cool, but it’s basically one way: it’s not really possible for TSL ‘speakers’ to create sentences using photographs of people making signs. She said it would mean they would have to learn three languages: TSL, signwriting, and written Thai (to communicate with the rest of the population). I don’t disagree, but I imagine it would be easier to learn to write Thai given literacy first in signwriting, which I presume would be an order of magnitude easier to acquire than any phonemic representation of a language (such as an alphabet-based script, which Thai is). That would be a fertile area for research, I imagine.