Last year we introduced a feature we internally called 'message
sharing': basically a mechanism to directly share translations
between different releases of the same project or of the same
distribution.
That was a huge improvement in usability (IMO, at least: you
translate a string in one release and it's instantly translated in
all the others), and it also made Launchpad Translations much
more scalable (this one is very tangible). E.g. compared to the one
full week it used to take us to "open" a new Ubuntu release for
translations, it took us just 25 minutes to do that for Karmic and
45 minutes for Lucid.
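To make the idea concrete, here is a toy sketch of what sharing means: translations are stored once per message and language, and every release whose template contains that message sees them. The class and method names are made up for illustration and say nothing about how Launchpad's database model actually does it.

```python
class SharedTranslations:
    """Toy model: translations are keyed by message, not by release."""

    def __init__(self):
        # (msgid, language) -> translated text, shared by all releases.
        self._translations = {}
        # release name -> set of msgids present in that release's template.
        self._templates = {}

    def add_template_message(self, release, msgid):
        self._templates.setdefault(release, set()).add(msgid)

    def translate(self, msgid, language, text):
        # One write, visible to every release that uses this msgid.
        self._translations[(msgid, language)] = text

    def lookup(self, release, msgid, language):
        # A release only sees messages that are in its own template.
        if msgid not in self._templates.get(release, set()):
            return None
        return self._translations.get((msgid, language))


store = SharedTranslations()
for release in ("karmic", "lucid"):
    store.add_template_message(release, "Open file")
store.add_template_message("lucid", "Save as")

# Translate once...
store.translate("Open file", "sr", "Otvori datoteku")

# ...and both releases pick it up.
karmic_result = store.lookup("karmic", "Open file", "sr")
lucid_result = store.lookup("lucid", "Open file", "sr")
print(karmic_result, "|", lucid_result)
```

Opening a new release in this model only means registering which messages its template contains; no translations get copied, which is the intuition behind the speed-up.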
However, while 'message sharing' has reduced duplication of effort
a lot, it still happens: translators work at the same time upstream
and in Ubuntu, and might be translating exactly the same strings.
What can we do to solve that?
Importing latest upstream translations
Well, first off, Launchpad doesn't even know about latest upstream
translations. What it gets is upstream translations as they were
packaged in a tarball that is the base of an Ubuntu package.
However, that might mean very old translations. For instance,
perhaps there was no Ubuntu package re-upload for 3 months.
Translations upstream usually get committed directly to a VCS.
They'll flow into Ubuntu only when they get packaged into tarballs,
and those tarballs become the basis for a new package in Ubuntu.
Today, maintainers decide when to release translations to the
world, and packagers decide what upstream releases go to Ubuntu
users.
This means that there are two high bars for translations to flow
over before they can get into Ubuntu:
- Upstream maintainers need to release a tarball with updated
  translations
- Ubuntu packagers need to prepare updated packages from these
  tarballs, and sometimes they can't even do that (without merging
  from VCS directly, because upstream might not be releasing
  'translations updates' tarballs)
How about we eliminate these blockers with Launchpad?
So, we want Launchpad to directly import upstream translations from
their VCS of choice. Luckily, we can depend on our amazing Launchpad
Code team and Bazaar community to provide us with a bzr branch no
matter what the upstream VCS of choice is. And we already have
imports from bzr branches, so we are all set, right?
Well, not exactly. Projects don't like to keep their generated
files in their repos. And for upstream projects, we can't really
ask them to (since we know it's a bad idea anyway). So, we need to
be able to generate templates (POT files) on the fly.
However, that is a very touchy job which depends on the upstream,
i.e. it's not the same thing if you are generating a template for
GNOME, KDE or a regular GNU (gettext-using) project. And many a
script that needs to be run to do this could be very risky: intltool
itself has a number of obvious implementation details such that any
upstream committer would be able to take over the machine it was run
on. So, this has to happen in a safe, sandboxed environment.
Not surprisingly, Launchpad already has this with Soyuz. We just need
to slightly modify it so we can run template generation jobs on it.
We've split this into two separate steps: developing a library that
allows us to generate templates for a particular source code layout
(module named "pottery" inside the LP tree, currently only
supporting intltool layouts), and working on the infrastructure to
run these on the existing Launchpad build farm.
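As a rough illustration of the first step, here is a minimal sketch of detecting intltool-style layouts in a source tree. It assumes that a directory containing a POTFILES.in file is a candidate, and the function names are hypothetical (this is not pottery's actual API). It deliberately only describes the `intltool-update --pot` job instead of running it, since running it is exactly the part that has to happen in the sandbox.

```python
import os
import tempfile


def find_candidate_po_dirs(root):
    """Return directories under `root` (relative paths) holding POTFILES.in.

    Assumption for this sketch: that file alone marks an intltool layout.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "POTFILES.in" in filenames:
            found.append(os.path.relpath(dirpath, root))
    return sorted(found)


def describe_pot_job(po_dir):
    """Describe (but do not run) the template generation job for `po_dir`."""
    return {"argv": ["intltool-update", "--pot"], "cwd": po_dir}


# Demo on a throwaway tree that mimics a small intltool project.
with tempfile.TemporaryDirectory() as tree:
    os.makedirs(os.path.join(tree, "po"))
    os.makedirs(os.path.join(tree, "src"))
    with open(os.path.join(tree, "po", "POTFILES.in"), "w") as potfiles:
        potfiles.write("src/main.c\n")
    candidates = find_candidate_po_dirs(tree)
    jobs = [describe_pot_job(po_dir) for po_dir in candidates]

print(candidates)
print(jobs)
```

Keeping detection (safe, just reading the tree) apart from generation (risky, runs upstream-controlled scripts) mirrors the two-step split described above.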
After translations are committed to upstream VCS, we should
import them into Launchpad asap
We are in the process of doing extensive QA on this code, and we
expect to roll it out next week. But, this is just a step of our
bigger vision.
As a side-note, this feature will also be useful for
intltool-based projects hosting their code and translations in
Launchpad: they won't have to keep POT files committed either.
In Ubuntu or in Launchpad
We could have gone one route and simply imported these upstream
translations directly into Ubuntu. It'd be a big win, but it
wouldn't work very well for those upstreams which are already in
Launchpad. And, since we are looking a bit further into the future,
there are other drawbacks to that approach as well (like being able
to send translations back upstream).
So, we decided that it's best to import them directly into Launchpad
projects, keep their upstream templates there for the future, but
keep those translations read-only.
Now, Launchpad's internal database model already has a sort of
definition of "upstream", though it was never exactly so (which is
why we always struggled with the name: over time, the term went
from "published" to "imported", and now finally to "upstream").
Through many discussions on different approaches, we decided to go
with the "fix is_imported flag" one.
This will enable us to share translations directly between upstreams
in LP (and because of the feature that we are QAing right now, we'll
have latest upstream translations in there already, no matter where
the project is hosted) and Ubuntu source packages.
The way we are going about this is very similar to the message sharing
we have today. It's just that now different privileges come into
action as well, making it all suitably more complex to handle.
This is something that we are actively working on, and something
that we hope to deliver in May.
Pushing latest imported translations into regular Ubuntu
language pack updates is the final stage
Before we can even consider calling this done, we'll have to do a
lot of testing. And we'll need help from the community to get
everything set up. The first thing to do is to go around Launchpad and
make sure that for every source package with translations in Ubuntu
there is a linked upstream project, and that upstream project has a
trunk branch that syncs with the latest upstream source code.
Next, we'll really need some serious QA to happen. If you are no
stranger to Python code, checking out the Launchpad tree and trying
out pottery on all the intltool branches you can think of would be
very useful input.
Or, if there is a favourite i18n layout that you'd like us to
support, extending pottery and our auto-approver to deal with it
would be a very welcome addition.
Even going ahead and splitting pottery into a separate branch and
module would be nice, because it would make it more re-usable (for
instance, it could then be used in GNOME's damned-lies) and easier
to extend for people not directly interested in Launchpad.
And... How about giving back?
Ubuntu will get the latest translations from upstreams then, which is
all pretty neat. But, how about contributing the translation fixes
back as well?
That is a natural next step. Having the latest templates and
translations in Launchpad will allow us to generate very precise
diffs between Ubuntu and upstream translations (i.e. we'll know what
string is Ubuntu-specific, and we'll know which translations are
newer). Then, we'll have to figure out how to submit those
upstream.
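To give a feel for what such a diff could look like, here is a small sketch that classifies messages by comparing two sets of translations. The data layout (a msgid mapped to a text and a last-change timestamp) and all names are assumptions made for the example, not Launchpad's model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Translation:
    text: str
    updated: datetime


def compare(ubuntu, upstream):
    """Classify each msgid by comparing Ubuntu and upstream translations."""
    report = {
        "ubuntu_only": [],     # string exists only in Ubuntu
        "upstream_only": [],   # string exists only upstream
        "ubuntu_newer": [],    # texts differ, Ubuntu changed last
        "upstream_newer": [],  # texts differ, upstream changed last (or tie)
        "same": [],            # identical text on both sides
    }
    for msgid in sorted(set(ubuntu) | set(upstream)):
        ours, theirs = ubuntu.get(msgid), upstream.get(msgid)
        if theirs is None:
            report["ubuntu_only"].append(msgid)
        elif ours is None:
            report["upstream_only"].append(msgid)
        elif ours.text == theirs.text:
            report["same"].append(msgid)
        elif ours.updated > theirs.updated:
            report["ubuntu_newer"].append(msgid)
        else:
            report["upstream_newer"].append(msgid)
    return report


# Made-up sample data, for illustration only.
ubuntu = {
    "Open": Translation("Otvori", datetime(2010, 3, 1)),
    "Quit": Translation("Izadji", datetime(2010, 4, 2)),
    "Ubuntu Help": Translation("Ubuntu pomoc", datetime(2010, 2, 1)),
}
upstream = {
    "Open": Translation("Otvori", datetime(2010, 1, 15)),
    "Quit": Translation("Zatvori", datetime(2010, 3, 20)),
    "Print": Translation("Stampaj", datetime(2010, 3, 5)),
}

report = compare(ubuntu, upstream)
print(report)
```

Only the "ubuntu_newer" bucket would be a candidate for sending back; what to do with Ubuntu-specific strings is a separate policy question.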
Should that happen automatically or should it be user-initiated?
How will Launchpad talk to each of the upstreams? Launchpad should
talk to every upstream as they prefer it, and that may mean
per-project, per-translation-team policies. But, I'll come back to
this topic once we have the foundation done with getting the latest
upstream translations into Ubuntu.