[threeten-dev] Possible addition of pattern letters in CLDR

Stephen Colebourne scolebourne at joda.org
Wed Dec 5 15:14:42 PST 2012


On 5 December 2012 04:36,  <yoshito_umaoka at us.ibm.com> wrote:
> "The pattern string is also similar, but not identical, to that defined by
> the Unicode Common Locale Data Repository (CLDR)."

This is the relevant section of comments in the code:

        FIELD_MAP.put('G', ChronoField.ERA);                       // Java, CLDR (different to both for 1/2 chars)
        FIELD_MAP.put('y', ChronoField.YEAR);                      // CLDR
        // FIELD_MAP.put('y', ChronoField.YEAR_OF_ERA);            // Java, CLDR  // TODO redefine from above
        // FIELD_MAP.put('u', ChronoField.YEAR);                   // CLDR  // TODO
        // FIELD_MAP.put('Y', ISODateTimeField.WEEK_BASED_YEAR);   // Java7, CLDR (needs localized week number)  // TODO
        // FIELD_MAP.put('Q', QuarterYearField.QUARTER_OF_YEAR);   // CLDR (removed quarter from 310)
        // FIELD_MAP.put('q', QuarterYearField.QUARTER_OF_YEAR);   // CLDR (needs standalone data)  // TODO
        FIELD_MAP.put('M', ChronoField.MONTH_OF_YEAR);             // Java, CLDR
        // FIELD_MAP.put('L', ChronoField.MONTH_OF_YEAR);          // Java, CLDR (needs standalone data)  // TODO
        // FIELD_MAP.put('w', ISODateTimeField.WEEK_OF_WEEK_BASED_YEAR);  // Java, CLDR (needs localized week number)  // TODO
        FIELD_MAP.put('D', ChronoField.DAY_OF_YEAR);               // Java, CLDR
        FIELD_MAP.put('d', ChronoField.DAY_OF_MONTH);              // Java, CLDR
        FIELD_MAP.put('F', ChronoField.ALIGNED_WEEK_OF_MONTH);     // Java, CLDR
        FIELD_MAP.put('E', ChronoField.DAY_OF_WEEK);               // Java, CLDR (different to both for 1/2 chars)
        // FIELD_MAP.put('e', ChronoField.DAY_OF_WEEK);            // CLDR (needs localized week number)  // TODO
        // FIELD_MAP.put('c', ChronoField.DAY_OF_WEEK);            // CLDR (needs standalone data)  // TODO
        FIELD_MAP.put('a', ChronoField.AMPM_OF_DAY);               // Java, CLDR
        FIELD_MAP.put('H', ChronoField.HOUR_OF_DAY);               // Java, CLDR
        FIELD_MAP.put('k', ChronoField.CLOCK_HOUR_OF_DAY);         // Java, CLDR
        FIELD_MAP.put('K', ChronoField.HOUR_OF_AMPM);              // Java, CLDR
        FIELD_MAP.put('h', ChronoField.CLOCK_HOUR_OF_AMPM);        // Java, CLDR
        FIELD_MAP.put('m', ChronoField.MINUTE_OF_HOUR);            // Java, CLDR
        FIELD_MAP.put('s', ChronoField.SECOND_OF_MINUTE);          // Java, CLDR
        FIELD_MAP.put('S', ChronoField.NANO_OF_SECOND);            // CLDR (Java uses milli-of-second number)
        FIELD_MAP.put('A', ChronoField.MILLI_OF_DAY);              // CLDR
        FIELD_MAP.put('n', ChronoField.NANO_OF_SECOND);            // 310
        FIELD_MAP.put('N', ChronoField.NANO_OF_DAY);               // 310
        // reserved - z,Z,X,I,p
        // Java - X - compatible, but extended to 4 and 5 letters
        // Java - u - clashes with CLDR, go with CLDR (year-proleptic) here
        // CLDR - U - cycle year name, not supported by 310 yet
        // CLDR - l - deprecated
        // CLDR - W - week-of-month following CLDR rules
        // CLDR - j - not relevant
        // CLDR - g - modified-julian-day
        // CLDR - z - time-zone names  // TODO properly
        // CLDR - Z - different approach here  // TODO bring 310 in line with CLDR
        // CLDR - v,V - extended time-zone names
        //  310 - I - time-zone id
        //  310 - p - prefix for padding

If you can decode my comments you'll find a variety of differences.

> With my quick review, following pattern letters are not available in LDML
> specification
>
>    n       nano-of-second              number/fraction   987654321
>    N       nano-of-day                 number/fraction   1234000000
>    I       time-zone ID                zoneID America/Los_Angeles
>
> I think all of these make sense. In longer term, I don't want to see
> different pattern definition across similar implementations. If these are
> final, I'm happy to work on CLDR side and propose these pattern letters
> registered in the LDML specification.

It would be good to see these in LDML. As of now, n and N are simple
numbers, not fractions.
(Previously, I used 'f' as a prefix that turned the next letter into a fraction.)
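
To make the number-versus-fraction distinction concrete, here is a minimal sketch. It assumes the java.time API as it eventually shipped in Java 8, where 'n' and 'N' print plain numbers and 'S' prints a fraction of a second; the 310 code at the time of this message may behave differently. The class and method names are hypothetical.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class NanoPatternSketch {

    // Hypothetical helper: formats a time with a single pattern string.
    static String format(String pattern, LocalTime time) {
        return DateTimeFormatter.ofPattern(pattern).format(time);
    }

    public static void main(String[] args) {
        LocalTime fiveMillis = LocalTime.of(0, 0, 0, 5_000_000);
        LocalTime justOverOneSecond = LocalTime.of(0, 0, 1, 234_000_000);

        // 'n' is nano-of-second as a plain number (no leading zeros).
        System.out.println(format("n", fiveMillis));          // 5000000
        // 'S' (assumed Java 8 semantics) is a fraction, so leading zeros are kept.
        System.out.println(format("SSS", fiveMillis));        // 005
        // 'N' is nano-of-day as a plain number.
        System.out.println(format("N", justOverOneSecond));   // 1234000000
    }
}
```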

However, "I" is interesting. In 310 we now have three underlying
concepts: offsets from UTC/Greenwich, regional time-zone IDs, and
general time-zone IDs (either an offset ID or a region ID). Most text
formats would want to print both the offset and the time-zone ID, but
they would not want to print the time-zone ID if it is an offset (as
opposed to a region). Thus, "I" should probably be the subset of all
time-zone IDs that represent regions. This works well in 310 where we
have this distinction, but may be tricky in LDML.
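
As a rough illustration of that rule (always print the offset, print the zone ID only when it names a region), here is a sketch against the java.time API as it later shipped in Java 8. The class and the describe() helper are hypothetical, and the instanceof check is just one way to tell an offset ID from a region ID; it is not necessarily how "I" would be implemented.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class RegionIdSketch {

    // Hypothetical helper: prints the offset always, and appends the zone ID
    // in brackets only when the zone is a region rather than a plain offset.
    static String describe(ZonedDateTime zdt) {
        String text = DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(zdt);
        ZoneId zone = zdt.getZone();
        if (!(zone instanceof ZoneOffset)) {
            text += "[" + zone.getId() + "]";
        }
        return text;
    }

    public static void main(String[] args) {
        ZonedDateTime region = ZonedDateTime.of(
                2012, 12, 5, 15, 14, 42, 0, ZoneId.of("America/Los_Angeles"));
        ZonedDateTime offsetOnly = ZonedDateTime.of(
                2012, 12, 5, 15, 14, 42, 0, ZoneId.of("-08:00"));

        System.out.println(describe(region));     // 2012-12-05T15:14:42-08:00[America/Los_Angeles]
        System.out.println(describe(offsetOnly)); // 2012-12-05T15:14:42-08:00
    }
}
```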


>    X       zone-offset 'Z' for zero    offset-X          Z; -0800; -08:00;
>
> This one is tricky. We recently added ISO 8601 style offset format in the
> LDML specification with "ZZZZZ"
> LDML "ZZZZZ" formats UTC to "Z" and non-UTC offsets using longer format
> such as "-08:00" (with colon).
>
>    Z       zone-offset                 offset-Z          +0000; -0800;
> -08:00;
>
> On the other hand, pattern letter "Z" in the LDML specification uses the
> following definition-
>
> Z, ZZ, ZZZ -> RFC822 format
> ZZZZ -> Localized GMT format, such as "GMT-08:00" "UTC-08:00"...
>
> I personally think CLDR may introduce pattern X, supporting both
> long/short offset format, and deprecate "ZZZZZ" (or simply leave it as an
> alias definition).
>
> For offset format, CLDR community is now seeking for shorter offset format
> support. For example, "GMT-3" instead of "GMT-03:00".

We don't have that right now.
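
For reference, the difference between X and Z described above can be sketched as follows. This assumes the pattern semantics of java.time as released in Java 8, which may not match the 310 code at the time of writing; the class and helper names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class OffsetPatternSketch {

    // Hypothetical helper: formats only the offset of a fixed date-time.
    static String format(String pattern, ZoneOffset offset) {
        OffsetDateTime odt = OffsetDateTime.of(2012, 12, 5, 15, 14, 42, 0, offset);
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(odt);
    }

    public static void main(String[] args) {
        ZoneOffset pst = ZoneOffset.ofHours(-8);

        // 'X' prints 'Z' for a zero offset.
        System.out.println(format("X", ZoneOffset.UTC));   // Z
        System.out.println(format("XX", pst));             // -0800
        System.out.println(format("XXX", pst));            // -08:00

        // 'Z' prints digits even for a zero offset.
        System.out.println(format("Z", ZoneOffset.UTC));   // +0000
        System.out.println(format("Z", pst));              // -0800
    }
}
```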

> Anyway, I'll consult with CLDR community members to find out if CLDR can
> provide compatible definitions for these too.

Keep us informed. We may have some ability to tweak these patterns for
the next few months, but it will be increasingly difficult.

Stephen

