February 2017 ULI-TC meeting

posted Feb 14, 2017, 5:58 PM by Steven R. Loomis

After a great January kickoff, our second meeting of 2017 will commence next Monday. Time in your local time zone:

Word Count effort begun

posted Feb 14, 2017, 5:47 PM by Steven R. Loomis

One of the challenges of translation interoperability is objectively measuring the difficulty of a particular translation workload. A common metric is the word count. However, methods for counting words vary across systems and languages. Some examples: Thai is written without space characters between words, as are Japanese and Chinese. Should numbers be counted? Are Mongolian suffixes counted as separate words?
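To make the ambiguity concrete, here is a minimal sketch of two naive counting rules. The function names and sample strings are hypothetical and chosen only for illustration; neither rule is a method proposed by the ULI-TC.

```python
def count_whitespace_tokens(text: str) -> int:
    """Count runs of non-whitespace characters (a common naive rule)."""
    return len(text.split())


def count_without_numbers(text: str) -> int:
    """Same rule, but skip tokens that contain no letters, such as '3' or '2017'."""
    return sum(1 for tok in text.split() if any(ch.isalpha() for ch in tok))


# Hypothetical sample strings, not taken from any ULI data set.
english = "We shipped 3 releases in 2017"
thai = "ภาษาไทยไม่มีช่องว่าง"  # several Thai words written with no spaces

print(count_whitespace_tokens(english))  # numbers included
print(count_without_numbers(english))    # numbers excluded
print(count_whitespace_tokens(thai))     # whitespace rule sees a single token
```

The two rules disagree on the English sample, and the whitespace rule treats the whole Thai phrase as one word, which is why a shared definition is needed before counts from different tools can be compared.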

The ULI-TC is hosting the development of a future Unicode technical note. You may follow and contribute to the discussion on this GitHub page.

Publicly Available Specifications

posted Oct 27, 2015, 11:33 PM by Steven R. Loomis

The ULI project now hosts documents which are archived for historical purposes and do not specify a Unicode standard. They are located at

ULI Segment Exceptions Posted in SVN and Demo Updated

posted Jan 18, 2013, 11:22 AM by Helena Shih   [ updated Jan 18, 2013, 2:26 PM by Steven R. Loomis ]

The latest ULI segmentation exception data has been posted in SVN, including:
  • Reference to CLDR date/month and other necessary symbols
  • Available in JSON (json-cooked) format.  The XLS files contain the input data including exception type and frequency.
The latest demo is also updated to reflect the changes above. Try it out yourself at

Things to try:
  • Compare the ULI and non-ULI versions of the English sample text. Note the breaks after "Mr." in the non-ULI version.
  • For German ULI, try the string "Im Okt. München war kalt." (roughly, "In October, Munich was cold."). Without ULI, the sentence breaks after the abbreviation "Okt." (for Oktober). With ULI and CLDR data, "Okt." is an exception, so no break occurs there.
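The effect of an exception list can be sketched in a few lines. This is a deliberately simplified illustration, not the Unicode sentence-break rules and not the demo's implementation; the `split_sentences` function and the hand-written exception sets are assumptions made for this example, not actual ULI data.

```python
def split_sentences(text, exceptions=frozenset()):
    """Break after '.', '!' or '?' when followed by whitespace or end of text,
    unless the token ending at that point is listed in `exceptions`."""
    sentences = []
    start = 0
    for i, ch in enumerate(text):
        at_boundary = i + 1 == len(text) or text[i + 1].isspace()
        if ch in ".!?" and at_boundary:
            # Last whitespace-delimited token of the candidate sentence.
            token = text[start:i + 1].split()[-1]
            if token in exceptions:
                continue  # suppress the break after a known abbreviation
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


german = "Im Okt. München war kalt."
print(split_sentences(german))                       # breaks after "Okt."
print(split_sentences(german, exceptions={"Okt."}))  # one sentence
```

With an empty exception set the German sample splits into two pieces; once "Okt." is listed, it stays a single sentence, which mirrors the behavior described for the demo.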

New Mailing List: uli-users

posted Mar 27, 2012, 7:54 AM by Kevin Lenzo

A public mailing list has been created for discussion of Unicode, localization, and interoperability. This mailing list is intended for broad-based conversations on these topics and is not limited to members of the Unicode Consortium.

To subscribe to uli-users, follow the directions here. The process is managed under the regular Unicode process, as described here.

LocalizationWorld Preconference Day on Unicode in Paris Jun 4 2012

posted Feb 14, 2012, 11:41 AM by Helena Shih   [ updated Feb 16, 2012, 7:15 PM ]

We are pleased to announce that Localization World is organizing a one-day workshop on Unicode, including an introduction with Richard Ishida and three additional sessions. This will take place on the preconference day, June 4, 2012, in Paris. Richard is an experienced presenter at Unicode conferences and is well known for his clear and effective presentations.

The Unicode Consortium’s goal is to enable people around the world to use computers in any language. The Consortium is involved in core internationalization specifications at the heart of all modern software, such as the Unicode Standard for character encoding. The Consortium’s involvement in localization is a key extension of this work. The Unicode Consortium maintains and extends the Common Locale Data Repository (CLDR), and in 2011 established the Unicode Localization Interoperability Technical Committee to improve the interoperability of localization data interchange.

For more information, including the program of the June LocalizationWorld Conference, please see
Ulrich Henes, Donna Parrish and Daniel Goldschmidt, chair, vice-chairs, Localization World Conference Program Committee
Helena Chapman, chair, Unicode Localization Interoperability Technical Committee

ULI TC Kick-off

posted May 23, 2011, 9:52 AM by Helena Shih

The Unicode Consortium recently announced a new technical committee focusing on standards for data interoperability of critical localization-related assets, such as language segmentation, translation source strings, translated strings, and translation memories. The Unicode Localization Interoperability (ULI) Technical Committee kick-off meeting has been scheduled for Friday, June 10, 2011, at 11am Eastern Time. It will be conducted by phone. Meeting material will be available at

There will be information about ULI, the initial focus area, next steps, and how you can be involved in this effort to help mature localization industry standards to meet your business needs. To receive the kick-off invitation, please use the reporting form to indicate your interest in ULI and provide your contact information.

For additional information about ULI, see

Participate and make global information exchange work better

posted May 13, 2011, 12:43 PM by Helena Shih

Dear Unicode Community,

As businesses move into broader markets, making offerings and services available in those markets becomes vital to the success and survival of any organization. According to a 2009 European Union study, the language industry’s compound annual growth rate was estimated at a minimum of 10% over the next few years, resulting in an approximate value of €16.5 billion to €20 billion in 2015. Unicode's stated mission is to "enable people around the world to use computers in any language". Technical standard development forms the foundation of the interoperability needed to achieve that mission. Ultimately, the benefits of information and technology should be available at the end-user level in the friendliest interface possible. Toward that end, the goals are to:

1. Gather requirements for core and extension of the specified standards in the areas of text segmentation and content memory,
2. Establish core specification scope, extension and implementations to improve the usefulness of existing standards and profiles for interoperability,
3. Provide consistent interpretation of the specification, extension and profiles.

With the recent announcement of the Unicode Localization Interoperability (ULI) Technical Committee, I would like to invite you to the ULI kick-off to help shape the scope and plan for this TC. To receive the kick-off invitation, please report your interest in ULI, along with your contact information.

Thank you. I look forward to speaking with you.

Unicode Consortium announces Localization Interoperability Technical Committee

posted May 2, 2011, 10:45 AM by Helena Shih

The Unicode Consortium announces a new technical committee, the Unicode Localization Interoperability Technical Committee. Localization of software information is a key part of the adoption of most software offerings in many countries. The purpose of the new committee is to ensure interoperable data interchange for critical localization-related assets, such as language segmentation, translation source strings, translated strings, and translation memories.

The initial focus of the Unicode Localization Interoperability (ULI) Technical Committee is on the improved interoperability of translation memories in the TMX format, segmentation rules that use the SRX format, and translation source strings and resulting translated strings that use the XLIFF format.

The ULI Technical Committee will establish profiles of use for TMX, SRX, and XLIFF. The committee will develop and publish specifications that document specific usage conventions that can be shared for interoperability. This will improve data interchange through more consistent implementations and will enhance the usefulness of these three standards.

For information on how to join the ULI effort and get involved in its work, contact the Unicode Consortium with the contact form and ask about the ULI.

To become a voting participant in the work of the ULI committee, join Unicode in one of the three voting categories of membership: Full, Institutional, or Supporting. See .

For more details about the ULI, see:
