Last year we introduced a feature we internally called 'message sharing': basically a mechanism to directly share translations between different releases of the same project or of the same distribution.
That was a huge improvement in usability (IMO, at least: you translate a string in one release and it's instantly translated in all the others), and it also made Launchpad Translations much more scalable (this one is very tangible). E.g. compared to the full week it used to take us to "open" a new Ubuntu release for translations, it took us just 25 minutes to do that for Karmic and 45 minutes for Lucid.
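To make the idea concrete, here is a minimal sketch of message sharing. It is not Launchpad's actual data model: the class and method names are made up for illustration. It only shows the principle that a translation is stored once per message and language, and every release whose template contains that message sees it.

```python
class SharedTranslations:
    """Toy model of message sharing (hypothetical, not Launchpad's schema)."""

    def __init__(self):
        # One shared store for all releases: (msgid, language) -> text.
        self._shared = {}
        # Which msgids each release's template contains.
        self._release_msgids = {}

    def add_template(self, release, msgids):
        """Register the set of messages that appear in a release."""
        self._release_msgids[release] = set(msgids)

    def translate(self, msgid, language, text):
        """Store a translation once; it is shared by every release."""
        self._shared[(msgid, language)] = text

    def lookup(self, release, msgid, language):
        """Return the shared translation if the release uses this msgid."""
        if msgid not in self._release_msgids.get(release, set()):
            return None
        return self._shared.get((msgid, language))


store = SharedTranslations()
store.add_template("karmic", ["Open", "Close"])
store.add_template("lucid", ["Open", "Save"])
store.translate("Open", "sr", "Otvori")
```

In this toy model, "opening" a new release only means registering its template; no translations are copied, which gives a feel for why that step became so much cheaper.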
However, while 'message sharing' has reduced duplication of effort a lot, duplication still happens: translators work at the same time upstream and in Ubuntu, and might be translating exactly the same strings.
What can we do to solve that?
Importing latest upstream translations
Well, first off, Launchpad doesn't even know about latest upstream translations. What it gets is upstream translations as they were packaged in a tarball that is the base of an Ubuntu package.
However, that might mean very old translations. For instance, perhaps there was no Ubuntu package re-upload for 3 months. Translations upstream usually get committed directly to a VCS. They'll flow into Ubuntu only when they get packaged into tarballs, and those tarballs become the basis for a new package in Ubuntu.
Today, maintainers decide when to release translations to the world, and packagers decide what upstream releases go to Ubuntu users
This means that there are two high bars for translations to flow over before they can get into Ubuntu:
- Upstream maintainers need to release a tarball with updated translations
- Ubuntu packagers need to prepare updated packages from these tarballs, and sometimes they can't even do that without merging from VCS directly, because upstream might not be releasing 'translation update' tarballs
How about we eliminate these blockers with Launchpad?
So, we want Launchpad to directly import upstream translations from their VCS of choice. Luckily, we can depend on our amazing Launchpad Code team and Bazaar community to provide us with a bzr branch no matter what the upstream VCS of choice is. And we already have imports from bzr branches, so we are all set, right?
Well, not exactly. Projects don't like to keep their generated files in their repos. And for upstream projects, we can't really ask them to (since we know it's a bad idea anyway). So, we need to be able to generate templates (POT files) on the fly.
However, that is a very touchy job which depends on the upstream, i.e. it's not the same thing if you are generating a template for a GNOME, KDE or regular GNU (gettext-using) project. And many of the scripts that need to be run to do this could be very risky: intltool itself has a number of implementation details that would let any upstream committer take over the machine it is run on. So, this has to happen in a safe, sandboxed environment.
Not surprisingly, Launchpad already has this with Soyuz. We just need to slightly modify it so we can run template generation jobs on it.
We've split this into two separate steps: developing a library that allows us to generate templates for a particular source code layout (module named "pottery" inside the LP tree, currently only supporting intltool layouts), and working on the infrastructure to run these on the existing Launchpad build farm.
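As a rough illustration of what "recognizing an intltool layout" can mean, here is a minimal sketch. It is not the pottery code itself: the function names are made up, and it only relies on the intltool convention that a translation directory holds a POTFILES.in file listing the source files to extract strings from. It deliberately runs no project scripts, so it does not need the sandbox.

```python
import os


def find_intltool_dirs(tree_root):
    """Return sorted relative paths of directories holding a POTFILES.in."""
    found = []
    for dirpath, dirnames, filenames in os.walk(tree_root):
        if "POTFILES.in" in filenames:
            found.append(os.path.relpath(dirpath, tree_root))
    return sorted(found)


def read_potfiles(potfiles_path):
    """List the source files named in a POTFILES.in file."""
    sources = []
    with open(potfiles_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith("#"):
                continue
            # Drop a leading "[...]" tag such as "[type: gettext/glade]".
            if line.startswith("[") and "]" in line:
                line = line.split("]", 1)[1].strip()
            # A tag-only line such as "[encoding: UTF-8]" names no file.
            if not line:
                continue
            sources.append(line)
    return sources
```

Actually generating the POT file from those sources is the risky part that has to run on the build farm; the sketch stops before that step.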
After translations are committed to upstream VCS, we should import them into Launchpad asap
We are in the process of doing extensive QA on this code, and we expect to roll it out next week. But this is just one step towards our bigger vision.
As a side-note, this feature will also be useful for intltool-based projects hosting their code and translations in Launchpad: they won't have to keep POT files committed either.
In Ubuntu or in Launchpad
We could have gone one route and simply imported these upstream translations directly into Ubuntu. It'd be a big win, but it wouldn't work very well for those upstreams which are already in Launchpad. And, since we are looking a bit further into the future, there are other drawbacks to that approach as well (like being able to send translations back upstream).
So, we decided that it's best to import them directly into Launchpad projects, keep their upstream templates there for the future, but keep those translations read-only.
Now, Launchpad's internal database model already has a sort of definition of "upstream", though it was never exactly that (which is why we always struggled with the name: over time, the term went from "published" to "imported", and now finally to "upstream").
After many discussions of different approaches, we decided to go with the one that fixes the is_imported flag.
This will enable us to share translations directly between upstreams in LP (and because of the feature that we are QAing right now, we'll have the latest upstream translations in there already, no matter where the project is hosted) and Ubuntu source packages.
The way we are going about this is very similar to the message sharing we have today. It's just that now different privileges come into play as well, making it all that much more complex to handle.
This is something that we are actively working on, and something that we hope to deliver in May.
Pushing latest imported translations into regular Ubuntu language pack updates is the final stage
Before we can even consider calling this done, we'll have to do a lot of testing. And we'll need help from the community to get everything set up. The first thing to do is to go around Launchpad and make sure that for every source package with translations in Ubuntu there is a linked upstream project, and that the upstream project has a trunk branch that syncs with the latest upstream source code.
Next, we'll really need some serious QA to happen. If you are no stranger to Python code, checking out the Launchpad tree and trying out pottery on all the intltool branches you can think of would give us very useful input.
Or, if there is a favourite i18n layout that you'd like us to support, extending pottery and our auto-approver to deal with it would be a very welcome addition.
Even going ahead and splitting pottery into a separate branch and module would be nice, because it would make it more re-usable (for instance, it could then be used in GNOME's damned-lies) and easier to extend for people not directly interested in Launchpad.
And... How about giving back?
Ubuntu will get latest translations from upstreams then, which is all pretty neat. But, how about contributing the translation fixes back as well?
That is a natural next step. Having the latest templates and translations in Launchpad will allow us to generate very precise diffs between Ubuntu and upstream translations (i.e. we'll know which strings are Ubuntu-specific, and we'll know which translations are newer). Then, we'll have to figure out how to submit those upstream.
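The kind of diff described above could be sketched as follows. This is only an illustration under simple assumptions, not Launchpad code: each side is a plain dict mapping a msgid to a hypothetical Translation record with the translated text and the time it was last changed.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record: translated text plus last-modified time.
Translation = namedtuple("Translation", ["text", "updated"])


def diff_translations(ubuntu, upstream):
    """Classify msgids where Ubuntu and upstream translations differ."""
    result = {"ubuntu_only": [], "ubuntu_newer": [], "upstream_newer": []}
    for msgid, ours in sorted(ubuntu.items()):
        theirs = upstream.get(msgid)
        if theirs is None:
            # The string is translated only on the Ubuntu side.
            result["ubuntu_only"].append(msgid)
        elif ours.text == theirs.text:
            # Identical translations: nothing to send either way.
            continue
        elif ours.updated > theirs.updated:
            result["ubuntu_newer"].append(msgid)
        else:
            result["upstream_newer"].append(msgid)
    return result


ubuntu = {
    "Open": Translation("Otvori", datetime(2010, 3, 1)),
    "Save": Translation("Sacuvaj", datetime(2010, 3, 5)),
    "About Ubuntu": Translation("O Ubuntuu", datetime(2010, 2, 1)),
    "Quit": Translation("Izadji", datetime(2010, 1, 1)),
}
upstream = {
    "Open": Translation("Otvori", datetime(2010, 1, 1)),
    "Save": Translation("Snimi", datetime(2010, 2, 1)),
    "Quit": Translation("Izlaz", datetime(2010, 4, 1)),
}
report = diff_translations(ubuntu, upstream)
```

Only the "ubuntu_only" and "ubuntu_newer" groups would be candidates for sending upstream; how they get there is the open question.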
Should that happen automatically or should it be user-initiated? How will Launchpad talk to each of the upstreams? Launchpad should talk to every upstream in the way they prefer, and that may mean per-project, per-translation-team policies. But I'll come back to this topic once we have the foundation in place, with the latest upstream translations getting into Ubuntu.