[IIAB] Haitian Creole

Braddock Gaskill braddock at braddock.com
Fri Aug 9 13:42:19 PDT 2013


Hi Joel and Chris,

Joel, I did much of that same legwork last night.  I (erroneously) used
cpf instead of ht (I didn't see a language code for ht in the other
locales, but will look again).  However I made a copy of the fr.dat pickled
file as a new cpf.dat and wrote an (ugly) routine to load it into the babel
cache from iiab/cpf.dat.
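
A minimal sketch of that kind of workaround, using only the standard library. It is not the code from the commit below: the helper names clone_locale_file and preload are hypothetical, and a plain dict stands in for Babel's internal locale cache.

```python
import os
import pickle
import shutil


def clone_locale_file(localedata_dir, source="fr", target="cpf"):
    """Copy <source>.dat to <target>.dat in the same directory; return the new path."""
    src = os.path.join(localedata_dir, source + ".dat")
    dst = os.path.join(localedata_dir, target + ".dat")
    shutil.copyfile(src, dst)
    return dst


def preload(cache, name, path):
    """Unpickle a locale file and register it in `cache` under `name`.

    `cache` is a plain dict standing in for the library's cache (an assumption).
    Only unpickle files you trust: pickle can execute arbitrary code.
    """
    with open(path, "rb") as f:
        cache[name] = pickle.load(f)
    return cache[name]
```

Keeping the copy in the application tree (here iiab/cpf.dat) rather than inside site-packages means the hack survives a reinstall of the library, at the cost of having to preload it at startup.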

See my commit from last night:
https://github.com/braddockcg/internet-in-a-box/commit/330f89eec500b9a75489e10c8dac925c0735a596

I agree with Chris that this is ugly, and we should get Haitian Creole
fixed upstream.  I'm not sure how to do that.

-braddock


On Fri, 9 Aug 2013 13:12:52 -0700, Joel Steres <joel.steres at ymobility.com>
wrote:
> I'm looking at this pretty much for the first time, but I have a few
> observations.
> 
> First off, Babel includes a list of default mappings from short names
> like 'de' to 'de_DE', defined in a dict (LOCALE_ALIASES holds the
> defaults), and a different mapping can be passed in from outside the
> library.  However, at first glance I didn't spot support for passing
> it in from the Flask-Babel extension.
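
The alias idea can be sketched as a plain dictionary lookup. This is an illustration only: DEFAULT_ALIASES is a stand-in for Babel's LOCALE_ALIASES, resolve_locale is a hypothetical helper, and the 'ht' entry is an assumption rather than anything the library ships.

```python
# Stand-in for the library's alias table; only 'de' -> 'de_DE' comes from the
# message above.
DEFAULT_ALIASES = {"de": "de_DE"}


def resolve_locale(code, aliases=DEFAULT_ALIASES):
    """Expand a short language code to a full locale name if an alias exists."""
    return aliases.get(code.lower(), code)


# Hypothetical extension: a caller-supplied table that also knows about 'ht'.
custom_aliases = dict(DEFAULT_ALIASES, ht="ht_HT")
```

Note that an alias only maps one name to another. If no locale data exists for the target name, the lookup that follows would still fail.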
> 
> Second the babel library stores its data in a number of pickled data
> files. The HT designation is already included in global.dat file but
> the localedata directory does not include a file for ht or ht_HT. The
> localedata files "contain much of the data defined by the Common
> Locale Data Repository".  We should be able to create the needed files
> (it's just a pickled, compound dict) but there might not be a nice way
> to include it without pushing it upstream or including a patch script.
>  When I unpickled one of the files I did notice that the ht language
> and HT territory codes ARE already included in the data structures
> (each language file seems to embed a mapping from each language and
> territory code to a proper name).
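
A short sketch of that kind of inspection, assuming the pickled dict keeps its name mappings under the keys "languages" and "territories" (an assumption about the file layout; lookup_names is a hypothetical helper):

```python
import pickle


def lookup_names(path, language, territory):
    """Return the display names a pickled locale file holds for the given codes.

    Returns a (language_name, territory_name) tuple; an entry is None when the
    code is absent. Only unpickle files you trust.
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    return (
        data.get("languages", {}).get(language),
        data.get("territories", {}).get(territory),
    )
```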
> 
> We can look more closely if need be.
> 
> Oh, a couple of asides, the localedata is stored in
> site-packages/Babel-0.9.6-py2.7.egg/babel/localedata/. The reason
> there are two language files is that in cases where different
> territories speak the same language the language group defaults can be
> overridden.  For example, en.dat describes the US, so en_US.dat is just
> a skeleton dictionary, whereas en_GB.dat has some values populated.
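
The override behaviour described here can be sketched as a recursive dict merge. This is a stand-alone illustration, not the library's own merge routine; merge_locale is a hypothetical helper, and any format strings fed to it are made-up sample values.

```python
import copy


def merge_locale(parent, child):
    """Return a new dict: `parent` values overridden by whatever `child` defines.

    Nested dicts are merged key by key; neither input is modified.
    """
    merged = copy.deepcopy(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_locale(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

With this shape, an empty territory dict (the en_US case) yields the language defaults unchanged, while a partially populated one (the en_GB case) replaces only the keys it defines.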
> 
> -j
> 
> 
> 
> On Thu, Aug 8, 2013 at 10:32 PM, Chris Leonard
> <cjlhomeaddress at gmail.com> wrote:
>> On Thu, Aug 8, 2013 at 11:58 PM, Braddock <braddock at braddock.com>
>> wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> "cpf", while listed as a language, did not have a locale entry.
>>
>> No, James is not correct about the ISO code selection, sorry.
>>
>> in ISO-639 "cpf" is not a language, it is a language group.
>>
>> http://en.wikipedia.org/wiki/Creoles_and_pidgins
>>
>> The correct ISO 639 code for Haitian Kreyol is either "ht"
>> (two-letter) or "hat" (three-letter).  Standard locale nomenclature
>> would be ht_HT.
>>
>> http://en.wikipedia.org/wiki/Haitian_Creole
>>
>> You would no more use "cpf" for Kreyòl Ayisyen than you would use "roa
>> - Romance languages" for Spanish or French.  Same concept, a language
>> group.
>>
>> http://en.wikipedia.org/wiki/Romance_languages
>>
>>> There is no pretty way to get a new language into the Babel library.
>>> I did a bit of a hack to pre-load the generic French locale as "cpf" -
>>> I'm assuming they follow French conventions in dates and currency,
>>> etc.
>>
>> We will need to create a CLDR locale to do this properly.  A glibc
>> locale exists for ht_HT, but it needs some work.  I'm afraid the
>> assumption that ht_HT uses the same dates and currency info as fr_FR
>> is also incorrect.
>>
>>> So Haitian Creole is now in IIAB.
>>
>> But "jammed in" in an unsustainable way at present. I will work on the
>> glibc and CLDR locales with Kreyol localizers to allow it to be called
>> properly, as ht-HT.
>>
>> Sorry to sound negative, but hacking i18n stuff is a dangerous
>> shortcut that ultimately leads to bad places we do not want to go.
>> Sadly it will take a little time to get it right, welcome to the
>> "bleeding edge" of L10n/i18n, it's where Sugar Labs lives...
>>
>> cjl
>> Sugar Labs Translation Team Coordinator
>>
>>
>>> On 08/08/2013 07:18 PM, James McDuffie wrote:
>>>> You should try and name it according to the ISO 639 language code
>>>> I believe:
>>>>
>>>> cpf --          Creoles and pidgins, French-based
>>>>
>>>> http://www.loc.gov/standards/iso639-2/php/code_list.php
>>>>
>>>> On Thu, Aug 8, 2013, at 04:33 PM, Braddock wrote: Chris and/or
>>>> James,
>>>>
>>>> When I add the Haitian Creole ht.po we are getting exceptions from
>>>> Python babel's library because it doesn't know the Haitian creole
>>>> 'ht' language code.
>>>>
>>>> It appears to depend on some underlying CLDR dataset which
>>>> contains the translations of language names into other languages
>>>> and a lot of other language information.
>>>>
>>>> Is there some standard way to inform the system of Haitian Creole?
>>>> Or is this a python specific thing?
>>>>
>>>> Thanks, Braddock
>>>>
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v1.4.11 (GNU/Linux)
>>> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>>>
>>> iQEcBAEBAgAGBQJSBGjPAAoJEHWLR/DQzlZurvEH/ij0lTm+TOa/z439+a98U5uo
>>> zCBUgJRK53nz58GbEYTrcsm1y0vh5mS/FMqI5/FDhTDzH+0tkyKf6uZ1rm2hhvab
>>> ptpxN2cBQsLGeCgQinpHBf7rV3UIbzcWi+jRSE+IFhgijFl/R2mvvdjNvGge/vMp
>>> haNhx63U95ZhVg6/SJpNn04dLx/TxDJBJwiOiQhBQ2ekQ8oB2QLsB7AeYVNyCFmM
>>> vchB7AD8aYbrlTjA5bpyzD3LDhP9javCU/ZzW0gZL+0mQOK1PEEDd5HlKa+5gqbj
>>> o2G+bVbVc2gXl6VeFDkWPCh0AEZmyTRdiBbOFb06Xhv2BczDhpDOQ/4M0/+WJzQ=
>>> =PR3U
>>> -----END PGP SIGNATURE-----
>>


