Gettext message catalogues from virtual dir within PYZ for GtkBuilder widgets

Is there an established approach to embed gettext locale/xy/LC_MESSAGES/* in a PYZ bundle? Specifically, to have Gtk's automatic widget translation pick them up from within the ZIP archive.

For other embedded resources, pkgutil.get_data or inspect.getsource work well enough. But the system and Python gettext APIs depend on bindtextdomain being supplied a plain old localedir; no resources or strings etc.

So I couldn't contrive a workable or even remotely practical workaround:

  1. Virtual gvfs/gio paths
    Now using archive://file%3A%2F%2Fmypkg.pyz%2Fmessages%2F IRIs would be an alternative to read other files directly from a zip. But glib's g_dgettext is still just a thin wrapper around the system lib, and therefore any such URLs can't be used as localedir.

  2. Partially extracting the zip
    That's how PyInstaller works, I think. But it's of course somewhat ridiculous to bundle something as a .pyz application, only to have it pre-extracted on each invocation.

  3. Userland gettext .mo/.po extraction
    Now reading out the message catalogues manually or just using trivial dicts instead would be an option. But only for in-application strings. That's again no way to have Gtk/GtkBuilder pick them up implicitly.
    Thus I'd have to manually traverse the whole widget tree: Labels, text, inner widgets, markup_text, etc. Possible, but meh.

  4. FUSE mounting
    This would be superflaky. But of course, the zip contents could be accessed via gvfs-mount etc. It just seems like a certain memory hog. And I doubt it's gonna stay reliable with e.g. two app instances running, or a previous one that terminated uncleanly. (I don't know, due to a system library, like gettext, stumbling over a fragile zip fuse point..)

  5. Gtk signal/event for translation(?)
    I've found squat about this, so I'm somewhat certain there's no alternative mechanism for widget translations in Gtk/PyGtk/GI. Gtk/Builder expects and is tied to gettext.
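For reference, option 2 restricted to just the catalogues could look roughly like the sketch below. It assumes the archive stores them under a locale/ prefix; the helper names extract_locale_dir and bind_native_domain are made up for illustration. locale.bindtextdomain only exists where the C library provides it (e.g. glibc), and whether Gtk then picks up the catalogues depends on it sharing that same libintl.

```python
import atexit
import io
import locale
import os
import shutil
import tempfile
import zipfile


def extract_locale_dir(zip_file, prefix="locale/"):
    """Extract only the members under `prefix` into a temp dir; return the localedir."""
    target = tempfile.mkdtemp(prefix="myapp-locale-")
    # Best-effort cleanup when the interpreter exits normally.
    atexit.register(shutil.rmtree, target, ignore_errors=True)
    with zipfile.ZipFile(zip_file) as zf:
        wanted = [name for name in zf.namelist() if name.startswith(prefix)]
        zf.extractall(target, members=wanted)
    return os.path.join(target, prefix.rstrip("/"))


def bind_native_domain(domain, localedir):
    """Point the C library's gettext at localedir where the platform supports it."""
    if hasattr(locale, "bindtextdomain"):
        locale.bindtextdomain(domain, localedir)
        return True
    return False


# Demo: a throwaway in-memory archive standing in for mypkg.pyz.
demo_zip = io.BytesIO()
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("locale/fr_FR/LC_MESSAGES/myapp.mo", b"placeholder")
    zf.writestr("__main__.py", "print('hi')")

localedir = extract_locale_dir(demo_zip)
# In the real application one would then call:
#     bind_native_domain("myapp", localedir)
```

Only the .mo files get written to disk, so the per-start cost is small, but it remains the extraction workaround described above rather than reading from the archive in place.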

Is there a more dependable approach perhaps?

There are 1 best solutions below

This is my example Glade/GtkBuilder/Gtk application. I've defined a function xml_gettext which transparently translates Glade XML files and passes the result to the gtk.Builder instance as a string.

import mygettext as gettext
import os
import sys

import gtk
from gtk import glade

glade_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<interface>
  <!-- interface-requires gtk+ 3.0 -->
  <object class="GtkWindow" id="window1">
    <property name="can_focus">False</property>
    <signal name="delete-event" handler="onDeleteWindow" swapped="no"/>
    <child>
      <object class="GtkButton" id="button1">
        <property name="label" translatable="yes">Welcome to Python!</property>
        <property name="use_action_appearance">False</property>
        <property name="visible">True</property>
        <property name="can_focus">True</property>
        <property name="receives_default">True</property>
        <signal name="pressed" handler="onButtonPressed" swapped="no"/>
      </object>
    </child>
  </object>
</interface>'''

class Handler:
    def onDeleteWindow(self, *args):
        gtk.main_quit(*args)

    def onButtonPressed(self, button):
        # os.environ.get avoids a KeyError when LANGUAGE is not set
        print('locale: {}\nLANGUAGE: {}'.format(
              gettext.find('myapp', 'locale'), os.environ.get('LANGUAGE')))

def main():
    builder = gtk.Builder()
    translated_xml = gettext.xml_gettext(glade_xml)
    builder.add_from_string(translated_xml)
    builder.connect_signals(Handler())

    window = builder.get_object("window1")
    window.show_all()

    gtk.main()

if __name__ == '__main__':
    main()  

I've archived my locale directories into locale.zip, which is included in the pyz bundle.
These are the contents of locale.zip:

(u'/locale/fr_FR/LC_MESSAGES/myapp.mo',
 u'/locale/en_US/LC_MESSAGES/myapp.mo',
 u'/locale/en_IN/LC_MESSAGES/myapp.mo')

To expose locale.zip as a filesystem, I use ZipFS from the fs package.

Fortunately, Python's gettext is not GNU gettext. The gettext module is pure Python; it doesn't call GNU gettext but mimics it. It has two core functions, find and translation. I've redefined these two in a separate module named mygettext to make them use files from the ZipFS.

gettext uses os.path, os.path.exists and open to find and open files, which I replace with the equivalent ones from the fs module.
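As a rough alternative sketch that avoids both the fs dependency and patching os.path: the standard GNUTranslations class accepts any binary file object, so a .mo member can be read straight out of a zip. The function name translations_from_zip and the locale/<lang>/LC_MESSAGES/<domain>.mo layout inside the archive are assumptions for illustration, and build_mo is only a demo helper that writes a tiny catalogue so the example is self-contained.

```python
import gettext
import io
import struct
import zipfile


def build_mo(messages):
    """Demo helper: serialize a {msgid: msgstr} dict into GNU .mo bytes."""
    # The empty msgid holds the catalogue header, which declares the charset.
    entries = {b"": b"Content-Type: text/plain; charset=UTF-8\n"}
    for msgid, msgstr in messages.items():
        entries[msgid.encode("utf-8")] = msgstr.encode("utf-8")
    keys = sorted(entries)
    n = len(keys)
    ids_blob = b"".join(k + b"\0" for k in keys)
    strs_blob = b"".join(entries[k] + b"\0" for k in keys)
    # Layout: 28-byte header, msgid table, msgstr table, then the string data.
    id_off = 28 + 16 * n
    str_off = id_off + len(ids_blob)
    id_table = str_table = b""
    for k in keys:
        id_table += struct.pack("<ii", len(k), id_off)
        str_table += struct.pack("<ii", len(entries[k]), str_off)
        id_off += len(k) + 1
        str_off += len(entries[k]) + 1
    header = struct.pack("<Iiiiiii", 0x950412DE, 0, n, 28, 28 + 8 * n, 0, 0)
    return header + id_table + str_table + ids_blob + strs_blob


def translations_from_zip(zip_file, domain, lang, localedir="locale"):
    """Load one catalogue from a zip; fall back to NullTranslations if absent."""
    member = "{}/{}/LC_MESSAGES/{}.mo".format(localedir, lang, domain)
    with zipfile.ZipFile(zip_file) as zf:
        try:
            data = zf.read(member)
        except KeyError:
            return gettext.NullTranslations()
    return gettext.GNUTranslations(io.BytesIO(data))


# Demo: an in-memory archive standing in for locale.zip.
demo_zip = io.BytesIO()
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("locale/fr_FR/LC_MESSAGES/myapp.mo",
                build_mo({"Welcome to Python!": "Bienvenue dans Python !"}))

fr = translations_from_zip(demo_zip, "myapp", "fr_FR")
missing = translations_from_zip(demo_zip, "myapp", "de_DE")
```

Like the patched module, this only covers strings translated from Python; it does no language negotiation from LANGUAGE or LANG, which find still handles in the version shown here.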

These are the contents of my application:

pyzzer.pyz -i glade_v1.pyz  
# A zipped Python application
# Built with pyzzer

Archive contents:
  glade_dist/glade_example.py
  glade_dist/locale.zip
  glade_dist/__init__.py
  glade_dist/mygettext.py
  __main__.py

Because pyz files have text, usually a shebang, prepended to them, I skip this line after opening the pyz file in binary mode. Other modules in the application that want to use the gettext.gettext function should import zfs_gettext from mygettext instead and alias it to _.
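As far as I know, the standard zipfile module locates the archive from its end-of-file record and tolerates prepended data, so it may be possible to open the pyz directly without skipping the shebang. A minimal sketch under that assumption follows; read_nested_member is a hypothetical helper name and the member paths are illustrative.

```python
import io
import os
import tempfile
import zipfile


def read_nested_member(pyz_path, inner_zip_name, member):
    """Read `member` from a zip stored inside a (possibly shebang-prefixed) pyz."""
    with zipfile.ZipFile(pyz_path) as outer:
        inner_bytes = outer.read(inner_zip_name)
    # The nested archive is small enough here to be held in memory.
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return inner.read(member)


# Demo: build a tiny pyz the way zipapp does (shebang first, then zip data).
inner_buf = io.BytesIO()
with zipfile.ZipFile(inner_buf, "w") as inner_zip:
    inner_zip.writestr("locale/fr_FR/LC_MESSAGES/myapp.mo", b"placeholder")

demo_dir = tempfile.mkdtemp()
demo_pyz = os.path.join(demo_dir, "demo.pyz")
with open(demo_pyz, "wb") as fd:
    fd.write(b"#!/usr/bin/env python\n")
    with zipfile.ZipFile(fd, "w") as outer_zip:
        outer_zip.writestr("glade_dist/locale.zip", inner_buf.getvalue())

payload = read_nested_member(
    demo_pyz, "glade_dist/locale.zip", "locale/fr_FR/LC_MESSAGES/myapp.mo")
```

In the real application pyz_path would be sys.argv[0], as in mygettext.py.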

Here goes mygettext.py.

from errno import ENOENT
from gettext import _expand_lang, _translations, _default_localedir
from gettext import GNUTranslations, NullTranslations
import gettext
import copy
import os
import sys
from xml.etree import ElementTree as ET
import zipfile

import fs
from fs.zipfs import ZipFS


zfs = None
if zipfile.is_zipfile(sys.argv[0]):
    try:
        myself = open(sys.argv[0],'rb')
        next(myself)
        zfs = ZipFS(ZipFS(myself,'r').open('glade_dist/locale.zip','rb'))
    except:
        pass
else:
    try:
        zfs = ZipFS('locale.zip','r')
    except:
        pass
if zfs:
    os.path = fs.path
    os.path.exists = zfs.exists
    open = zfs.open

def find(domain, localedir=None, languages=None, all=0):

    # Get some reasonable defaults for arguments that were not supplied
    if localedir is None:
        localedir = _default_localedir
    if languages is None:
        languages = []
        for envar in ('LANGUAGE', 'LC_ALL', 'LC_MESSAGES', 'LANG'):
            val = os.environ.get(envar)
            if val:
                languages = val.split(':')
                break
        if 'C' not in languages:
            languages.append('C')
    # now normalize and expand the languages
    nelangs = []
    for lang in languages:
        for nelang in _expand_lang(lang):
            if nelang not in nelangs:
                nelangs.append(nelang)
    # select a language
    if all:
        result = []
    else:
        result = None
    for lang in nelangs:
        if lang == 'C':
            break
        mofile = os.path.join(localedir, lang, 'LC_MESSAGES', '%s.mo' % domain)
        mofile_lp = os.path.join("/usr/share/locale-langpack", lang,
                               'LC_MESSAGES', '%s.mo' % domain)

        # first look into the standard locale dir, then into the 
        # langpack locale dir

        # standard mo file
        if os.path.exists(mofile):
            if all:
                result.append(mofile)
            else:
                return mofile

        # langpack mofile -> use it
        if os.path.exists(mofile_lp): 
            if all:
                result.append(mofile_lp)
            else:
                return mofile_lp

    return result

def translation(domain, localedir=None, languages=None,
                class_=None, fallback=False, codeset=None):
    if class_ is None:
        class_ = GNUTranslations
    mofiles = find(domain, localedir, languages, all=1)
    if not mofiles:
        if fallback:
            return NullTranslations()
        raise IOError(ENOENT, 'No translation file found for domain', domain)
    # Avoid opening, reading, and parsing the .mo file after it's been done
    # once.
    result = None
    for mofile in mofiles:
        key = (class_, os.path.abspath(mofile))
        t = _translations.get(key)
        if t is None:
            with open(mofile, 'rb') as fp:
                t = _translations.setdefault(key, class_(fp))
        # Copy the translation object to allow setting fallbacks and
        # output charset. All other instance data is shared with the
        # cached object.
        t = copy.copy(t)
        if codeset:
            t.set_output_charset(codeset)
        if result is None:
            result = t
        else:
            result.add_fallback(t)
    return result

def xml_gettext(xml_str):
    root = ET.fromstring(xml_str)
    labels = root.findall('.//*[@name="label"][@translatable="yes"]')
    for label in labels:
        label.text = _(label.text)
    return ET.tostring(root)

gettext.find = find
gettext.translation = translation
_ = zfs_gettext = gettext.gettext

gettext.bindtextdomain('myapp','locale')
gettext.textdomain('myapp')

The following two shouldn't be called because glade doesn't use Python gettext.

glade.bindtextdomain('myapp','locale')
glade.textdomain('myapp')
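The xml_gettext function above only touches properties named label. A slightly more general sketch, assuming any element marked translatable="yes" should be translated and taking the translator as a parameter, could look like this. The name translate_builder_xml is made up for illustration; context and comments attributes are ignored, and XML comments are dropped by ElementTree.

```python
from xml.etree import ElementTree as ET


def translate_builder_xml(xml_str, translate):
    """Return the UI definition with every translatable="yes" text translated."""
    root = ET.fromstring(xml_str)
    for node in root.iter():
        if node.get("translatable") == "yes" and node.text:
            node.text = translate(node.text)
    # encoding="unicode" makes tostring return str rather than bytes.
    return ET.tostring(root, encoding="unicode")


# Demo with a plain dict standing in for a gettext catalogue.
sample_xml = '''<interface>
  <object class="GtkButton" id="button1">
    <property name="label" translatable="yes">Welcome to Python!</property>
    <property name="tooltip_text" translatable="yes">Click me</property>
    <property name="visible">True</property>
  </object>
</interface>'''

catalog = {
    "Welcome to Python!": "Bienvenue dans Python !",
    "Click me": "Cliquez ici",
}
translated = translate_builder_xml(sample_xml, lambda s: catalog.get(s, s))
```

In the application, translate would be the _ alias from mygettext and the result would go to builder.add_from_string.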