[threeten-dev] Possible addition of pattern letters in CLDR
Stephen Colebourne
scolebourne at joda.org
Wed Dec 5 15:14:42 PST 2012
On 5 December 2012 04:36, <yoshito_umaoka at us.ibm.com> wrote:
> "The pattern string is also similar, but not identical, to that defined by
> the Unicode Common Locale Data Repository (CLDR)."
This is the relevant section of comments in the code:
FIELD_MAP.put('G', ChronoField.ERA); // Java, CLDR (different to both for 1/2 chars)
FIELD_MAP.put('y', ChronoField.YEAR); // CLDR
// FIELD_MAP.put('y', ChronoField.YEAR_OF_ERA); // Java, CLDR // TODO redefine from above
// FIELD_MAP.put('u', ChronoField.YEAR); // CLDR // TODO
// FIELD_MAP.put('Y', ISODateTimeField.WEEK_BASED_YEAR); // Java7, CLDR (needs localized week number) // TODO
// FIELD_MAP.put('Q', QuarterYearField.QUARTER_OF_YEAR); // CLDR (removed quarter from 310)
// FIELD_MAP.put('q', QuarterYearField.QUARTER_OF_YEAR); // CLDR (needs standalone data) // TODO
FIELD_MAP.put('M', ChronoField.MONTH_OF_YEAR); // Java, CLDR
// FIELD_MAP.put('L', ChronoField.MONTH_OF_YEAR); // Java, CLDR (needs standalone data) // TODO
// FIELD_MAP.put('w', ISODateTimeField.WEEK_OF_WEEK_BASED_YEAR); // Java, CLDR (needs localized week number) // TODO
FIELD_MAP.put('D', ChronoField.DAY_OF_YEAR); // Java, CLDR
FIELD_MAP.put('d', ChronoField.DAY_OF_MONTH); // Java, CLDR
FIELD_MAP.put('F', ChronoField.ALIGNED_WEEK_OF_MONTH); // Java, CLDR
FIELD_MAP.put('E', ChronoField.DAY_OF_WEEK); // Java, CLDR (different to both for 1/2 chars)
// FIELD_MAP.put('e', ChronoField.DAY_OF_WEEK); // CLDR (needs localized week number) // TODO
// FIELD_MAP.put('c', ChronoField.DAY_OF_WEEK); // CLDR (needs standalone data) // TODO
FIELD_MAP.put('a', ChronoField.AMPM_OF_DAY); // Java, CLDR
FIELD_MAP.put('H', ChronoField.HOUR_OF_DAY); // Java, CLDR
FIELD_MAP.put('k', ChronoField.CLOCK_HOUR_OF_DAY); // Java, CLDR
FIELD_MAP.put('K', ChronoField.HOUR_OF_AMPM); // Java, CLDR
FIELD_MAP.put('h', ChronoField.CLOCK_HOUR_OF_AMPM); // Java, CLDR
FIELD_MAP.put('m', ChronoField.MINUTE_OF_HOUR); // Java, CLDR
FIELD_MAP.put('s', ChronoField.SECOND_OF_MINUTE); // Java, CLDR
FIELD_MAP.put('S', ChronoField.NANO_OF_SECOND); // CLDR (Java uses milli-of-second number)
FIELD_MAP.put('A', ChronoField.MILLI_OF_DAY); // CLDR
FIELD_MAP.put('n', ChronoField.NANO_OF_SECOND); // 310
FIELD_MAP.put('N', ChronoField.NANO_OF_DAY); // 310
// reserved - z,Z,X,I,p
// Java - X - compatible, but extended to 4 and 5 letters
// Java - u - clashes with CLDR, go with CLDR (year-proleptic) here
// CLDR - U - cycle year name, not supported by 310 yet
// CLDR - l - deprecated
// CLDR - W - week-of-month following CLDR rules
// CLDR - j - not relevant
// CLDR - g - modified-julian-day
// CLDR - z - time-zone names // TODO properly
// CLDR - Z - different approach here // TODO bring 310 in line with CLDR
// CLDR - v,V - extended time-zone names
// 310 - I - time-zone id
// 310 - p - prefix for padding
If you can decode my comments you'll find a variety of differences.
> With my quick review, following pattern letters are not available in LDML
> specification
>
> n   nano-of-second   number/fraction   987654321
> N   nano-of-day      number/fraction   1234000000
> I   time-zone ID     zoneID            America/Los_Angeles
>
> I think all of these make sense. In longer term, I don't want to see
> different pattern definition across similar implementations. If these are
> final, I'm happy to work on CLDR side and propose these pattern letters
> registered in the LDML specification.
It would be good to see these in LDML. As of now, n and N are simple
numbers, not fractions.
(Previously, I used 'f' as a prefix that turned the next letter into a fraction.)
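
A minimal sketch of the number-versus-fraction distinction, assuming the `java.time` API as it later shipped in Java 8 (the 2012 threeten build discussed here may have behaved differently). The class name `NanoPatternDemo` and its helper methods are made up for illustration.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class NanoPatternDemo {

    // 'n' prints nano-of-second as a plain number (no zero padding for one letter).
    static String nanoOfSecond(LocalTime time) {
        return DateTimeFormatter.ofPattern("n", Locale.ROOT).format(time);
    }

    // 'N' prints nano-of-day as a plain number.
    static String nanoOfDay(LocalTime time) {
        return DateTimeFormatter.ofPattern("N", Locale.ROOT).format(time);
    }

    // 'S' prints the fraction of the second, so leading zeros are significant.
    static String fractionOfSecond(LocalTime time) {
        return DateTimeFormatter.ofPattern("SSSSSSSSS", Locale.ROOT).format(time);
    }

    public static void main(String[] args) {
        LocalTime time = LocalTime.of(0, 0, 1, 5); // 00:00:01.000000005
        System.out.println(nanoOfSecond(time));     // 5
        System.out.println(nanoOfDay(time));        // 1000000005
        System.out.println(fractionOfSecond(time)); // 000000005
    }
}
```

The same five nanoseconds come out as "5" under 'n' and as "000000005" under a nine-letter 'S', which is the difference between a number and a fraction.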
However, "I" is interesting. In 310 we now have three underlying
concepts: offsets from UTC/Greenwich, regional time-zone IDs, and
general time-zone IDs (either an offset ID or a region ID). Most text
formats would want to print both the offset and the time-zone ID, but
they would not want to print the time-zone ID if it is an offset (as
opposed to a region). Thus, "I" should probably be the subset of all
time-zone IDs that represent regions. This works well in 310 where we
have this distinction, but may be tricky in LDML.
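
The region-only idea can be sketched roughly as follows, assuming the `java.time` API as later shipped in Java 8, where `ZoneOffset` is a subclass of `ZoneId`. The class `RegionIdDemo` and the method `regionId` are hypothetical names, not part of 310, and the `instanceof` test is only one possible way to draw the line.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.Optional;

public class RegionIdDemo {

    // Returns the zone ID only when the zone is a region, not a bare offset.
    // Assumption: "is an offset" is approximated by "is an instance of ZoneOffset".
    static Optional<String> regionId(ZoneId zone) {
        if (zone instanceof ZoneOffset) {
            return Optional.empty();
        }
        return Optional.of(zone.getId());
    }

    public static void main(String[] args) {
        // A region ID is printed.
        System.out.println(regionId(ZoneId.of("America/Los_Angeles")).orElse("(none)"));
        // An offset-style ID is suppressed, since the offset is printed separately.
        System.out.println(regionId(ZoneId.of("-08:00")).orElse("(none)"));
    }
}
```

Note that an ID such as "UTC+01:00" is not a `ZoneOffset` instance in `java.time`, so this simple check would still treat it as a region; a stricter rule would need an extra step.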
> X   zone-offset 'Z' for zero   offset-X   Z; -0800; -08:00;
>
> This one is tricky. We recently added ISO 8601 style offset format in the
> LDML specification with "ZZZZZ"
> LDML "ZZZZZ" formats UTC to "Z" and non-UTC offsets using longer format
> such as "-08:00" (with colon).
>
> Z   zone-offset                offset-Z   +0000; -0800; -08:00;
>
> On the other hand, pattern letter "Z" in the LDML specification uses the
> following definition-
>
> Z, ZZ, ZZZ -> RFC822 format
> ZZZZ -> Localized GMT format, such as "GMT-08:00" "UTC-08:00"...
>
> I personally think CLDR may introduce pattern X, supporting both
> long/short offset format, and deprecate "ZZZZZ" (or simply leave it as an
> alias definition).
>
> For offset format, CLDR community is now seeking for shorter offset format
> support. For example, "GMT-3" instead of "GMT-03:00".
We don't have that right now.
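
For the "X" and "Z" offset letters in the quoted text above, here is a small sketch of how they print, assuming the `java.time` API as it later shipped in Java 8. That final behaviour may not match the 2012 threeten build, and the class name `OffsetPatternDemo` is made up for illustration.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class OffsetPatternDemo {

    // Formats a fixed sample date-time at the given offset with the given pattern.
    static String format(String pattern, ZoneOffset offset) {
        OffsetDateTime sample = OffsetDateTime.of(2012, 12, 5, 15, 14, 42, 0, offset);
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(sample);
    }

    public static void main(String[] args) {
        ZoneOffset pst = ZoneOffset.ofHours(-8);

        // 'X' prints 'Z' for a zero offset; more letters give longer forms.
        System.out.println(format("X", pst));              // -08
        System.out.println(format("XX", pst));             // -0800
        System.out.println(format("XXX", pst));            // -08:00
        System.out.println(format("XXX", ZoneOffset.UTC)); // Z

        // One to three 'Z' letters give the RFC 822 style, with +0000 for zero.
        System.out.println(format("Z", pst));              // -0800
        System.out.println(format("Z", ZoneOffset.UTC));   // +0000

        // Five 'Z' letters give the ISO 8601 style with a colon and 'Z' for zero.
        System.out.println(format("ZZZZZ", pst));            // -08:00
        System.out.println(format("ZZZZZ", ZoneOffset.UTC)); // Z
    }
}
```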
> Anyway, I'll consult with CLDR community members to find out if CLDR can
> provide compatible definitions for these too.
Keep us informed. We may have some ability to tweak these patterns for
the next few months, but it will be increasingly difficult.
Stephen