As anyone who has tried to parse wikitext or even Wikipedia’s HTML will know, it’s not an easy task. Looks like the NLA needs to work on scrubbing references and ignoring disambiguation pages.
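For what it's worth, the two fixes mentioned above are not hard to approximate. The following is only a rough sketch, not the NLA's code: the function names are made up for illustration, the list of disambiguation template names is an assumption and far from exhaustive, and real wikitext has plenty of edge cases this ignores.

```python
import re

# Footnote markers as they appear in rendered article text, e.g. [1], [23]
# or [citation needed].
_REF_MARKER = re.compile(r"\[(?:\d+|citation needed)\]")

# A few template names commonly used to tag disambiguation pages.
# This list is an assumption for illustration and is not exhaustive.
_DISAMBIG_TEMPLATE = re.compile(
    r"\{\{\s*(?:disambiguation|disambig|dab|hndis|geodis)\s*(?:\||\}\})",
    re.IGNORECASE,
)


def strip_reference_markers(text):
    """Remove footnote markers from plain text and tidy leftover spacing."""
    cleaned = _REF_MARKER.sub("", text)
    return re.sub(r"[ \t]{2,}", " ", cleaned).strip()


def is_disambiguation(wikitext):
    """Return True if the wikitext carries one of the known disambiguation templates."""
    return _DISAMBIG_TEMPLATE.search(wikitext) is not None
```

A caller would skip any title for which `is_disambiguation` returns true, and run `strip_reference_markers` over the intro text before displaying it.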
To see whether they were caching data or pulling it live, I made a minor edit to the intro of James Spigelman and reloaded the NLA author page. To my surprise, the change I had just made showed up, meaning they are pulling data live. (They are also pulling thumbnails from Commons, as in the Knuth bio, although the link to the image page has not been preserved.)
I suppose the NLA’s requests are like a fly on the back of Wikipedia, but still, fetching the article live on every page view may not be a particularly good idea.
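A small time-based cache would avoid hitting Wikipedia on every page view. Here is a minimal sketch, assuming a hypothetical `fetch` function that takes an article title and returns its intro text. The class name, the one-hour default and the in-memory dictionary are all my own choices for illustration and say nothing about how the NLA's site is actually built.

```python
import time


class TTLCache:
    """Cache fetched values per title and refetch once they are older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch        # hypothetical callable: title -> intro text
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._store = {}           # title -> (time fetched, value)

    def get(self, title):
        now = self._clock()
        hit = self._store.get(title)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # still fresh, no request made
        value = self._fetch(title)
        self._store[title] = (now, value)
        return value
```

With something like this in front of the fetch, an edit on Wikipedia would take up to an hour to appear, which seems a fair trade for not making a request on every reload.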
(via the Australian Wikipedians’ Notice Board)