Scripting is easy, maintenance is hard

Daily-image-l is a mailing list that I started in February 2007. It is the Commons counterpart to Wikipedia’s daily-article-l which I believe is still lovingly crafted by hand. (Here’s today’s, and apparently it has expanded to encompass Wiktionary and Wikiquote’s daily offerings.)

Of course, mailing people an actual image each day would quickly eat up quite a lot of resources, especially considering the recipients won’t necessarily even look at it. So I decided to just send people a link and a brief description, and let them choose whether or not to click through. In keeping with Wikimedia Commons’ multilinguality, I also decided to make daily-image-l multilingual, with all the different language captions in a single email. Here is today’s, as rendered by Gmail:

I guess I initially crafted the daily-image-l emails by hand, but you don’t need too many pattern recognition skills to see that this email is something a machine can do. So now it’s a Python script that uses wget to get the HTML of the page for the descriptions (hacking it apart with regexes), the MediaWiki API for the license and category info, and sendmail to bundle it off to Mailman. And if I were doing it today? I don’t think it would be much different.
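For the curious, here is roughly what those three steps look like. This is a minimal sketch and not the actual script: the caption markup matched by CAPTION_RE, the subject line, the function names and the sendmail path are all assumptions made up for illustration. The only parts taken from real interfaces are the MediaWiki API query parameters (action=query, prop=categories) and sendmail's -t and -oi flags.

```python
import re
import subprocess
from email.message import EmailMessage
from urllib.parse import quote, urlencode

API = "https://commons.wikimedia.org/w/api.php"

# Hypothetical markup: the real page HTML will look different.
CAPTION_RE = re.compile(r'<div class="caption" lang="([a-z-]+)">(.*?)</div>', re.S)


def extract_captions(html):
    """Hack the per-language captions out of the page HTML with a regex."""
    return {lang: text.strip() for lang, text in CAPTION_RE.findall(html)}


def categories_query_url(file_title):
    """Build a MediaWiki API URL asking for the categories of one file page."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": file_title,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def build_message(file_title, captions, sender, list_address):
    """Compose one email holding a link plus every language's caption."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    msg["Subject"] = "Picture of the day: " + file_title  # assumed subject format
    page = quote(file_title.replace(" ", "_"), safe=":")
    lines = ["https://commons.wikimedia.org/wiki/" + page, ""]
    for lang in sorted(captions):
        lines.append(f"[{lang}] {captions[lang]}")
    msg.set_content("\n".join(lines))
    return msg


def send_via_sendmail(msg, sendmail_path="/usr/sbin/sendmail"):
    """Hand the message to the local sendmail binary (path is an assumption)."""
    # -t reads the recipients from the headers; -oi stops a lone "." ending input.
    subprocess.run([sendmail_path, "-t", "-oi"], input=msg.as_bytes(), check=True)
```

Sending only a link and the captions keeps each mail tiny, and nothing beyond the standard library is needed, which is probably why I wouldn't do it much differently today.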

Once I got the thing working consistently, it really didn’t require much maintenance. It just does its thing. I just try to notice when something goes wrong. “Noticing” is more difficult than you might think, mainly because of this:

Mailman -owner spam!

In theory, if you admin a Mailman mailing list with the address foo at lists.bar.com, the list subscribers will be able to reach you by emailing foo-owner at lists.bar.com. In practice, however, if your list is even remotely public or remotely old, anything they write to this address will probably never be seen by the list admin, because 99% of what arrives there is spam.

In case you missed it, two of these messages are valid:

From: [removed]
To: <daily-image-l-bounces@lists.wikimedia.org>
Date: Tue, 7 Jul 2009 11:39:59 +0200
Subject: help
non mi arrivano più le mail con l'immagini del giorno
la ringrazio anticipatamente
[In English: “I am no longer receiving the emails with the picture of the day. Thank you in advance.”]

(This is buried in the message with the subject “Notifica errore non riconosciuta”, literally “unrecognised error notification”.)

From: [removed]
To: <daily-image-l-bounces@lists.wikimedia.org>
Date: Tue, 7 Jul 2009 07:04:50 +0200
Subject: AW: Datenschutz-Warnung von Mailman
Sehr geehrter "Mailman",
obwohl ich selbst es war, der diesen erneuten Abonnementsantrag für die
Mailingliste daily-image-l@lists.wikimedia.org gestellt habe, betrachte ich
dieses Schreiben nicht ganz als gegenstandslos.
Denn seit ca. 5 Tagen funktioniert die tägliche Mail für das "Bild des
Tages" bei mir nicht mehr. Die tägliche Mail bleibt einfach aus. Was ist da
los?
Vielleicht können Sie weiterhelfen.
Vielen Dank und freundliche Grüße
[name]
[In English: Subject: “RE: Privacy warning from Mailman”. “Dear ‘Mailman’, although it was I myself who submitted this renewed subscription request for the mailing list daily-image-l@lists.wikimedia.org, I do not consider this message entirely moot. For about 5 days the daily mail for the ‘Picture of the Day’ has not been working for me. The daily mail simply does not arrive. What is going on? Perhaps you can help. Many thanks and kind regards, [name]”]

I don’t know why people like to write to the bounce address. Or are they just hitting reply, trying to post to the mailing list, and then getting this bounce, which for some reason is forwarded to -owner? And beyond this I don’t speak Italian or German anyway.

Luckily, someone eventually leaves a comment in the right place, which is http://commons.wikimedia.org/wiki/Commons_talk:Daily-image-l. Although I don’t check that every day, RSS comes to the rescue:

That leads me to inspect the July archive, and indeed something has gone awry. It’s 8th July but there are only posts for the first 3 days. (If I had actually paid attention, it might not have been such a surprise.)
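A gap like this is easy for a machine to spot, even if it is hard for me. The following is a minimal sketch, assuming you already have the dates of the posts in the current month's archive; the function name missing_days and the idea of running it as a daily check are mine, not something the script does.

```python
from datetime import date, timedelta


def missing_days(posted, today):
    """Return the days of the current month, up to and including today, with no post."""
    day = today.replace(day=1)
    gaps = []
    while day <= today:
        if day not in posted:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# The situation described above: posts on 1-3 July, checked on 8 July.
posted = {date(2009, 7, 1), date(2009, 7, 2), date(2009, 7, 3)}
for gap in missing_days(posted, date(2009, 7, 8)):
    print("no post on", gap.isoformat())
```

Wired up to a cron job that mails me directly (not the -owner address!), something like this would have caught the problem on the 4th.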

Now this script is run from the toolserver — the stable toolserver, in fact. Stable toolserver was set up to run allegedly “stable” projects with multiple caretakers or maintainers. I agreed to be a maintainer for the poty project with Bryan (this software was used to conduct Picture of the Year voting in 2007 and 2008). Because I already had a stable toolserver account (as opposed to regular toolserver), it seemed easiest to set up my daily-image-l script on stable as well. Bryan agreed to be a maintainer for that (and he actually did make some improvements :)) and stable project potd was born.

But I basically haven’t touched it for what seems like years. So to try and find out what had happened this time I had to dredge all the bits and pieces back together from the depths of my memory:

An hour or so later and I am pretty sure daily-image-l will return to its regular programming. (So to speak.)

While all this useful information is fresh in my brain, I think I will try and put a copy of this measly script in SVN. daily-image-l now has over 2,500 subscribers, which is pretty neat considering the MEAN amount of work I do on it each day is 0 seconds. Better to put it in source control before any crisis hits.

I think I’m trying to teach myself something. The moral is: authoring code is finite, but maintenance is forever. Do yourself a favour and document how all the bits bolt together. Because if you don’t have a sysadmin at your beck and call, trying to piece it all together from stray emails will be really irritating!

09 July, 2009

Comment

1

Ah, so much fun in being a list owner… the spam, the out-of-office replies…

It seems like ages to me as well since I touched that script. I’m glad you were eventually able to fix it.
Maybe we should put this thing in Wikimedia SVN? Do you have an account for that?

Anyway, I so much agree with you that we got a pretty good result-over-work ratio :D

Bryan · 11. July 2009, 07:11

2

Yeah I want to put it in SVN. I don’t believe I have an account. I think I have to make a JIRA request to get a repo created. Unless you can somehow use an existing one??

pfctdayelise · 11. July 2009, 14:14

3

We could use the Wikimedia/MediaWiki SVN at svn.wikimedia.org.

Bryan · 12. July 2009, 05:34
