[threeten-dev] Possible addition of pattern letters in CLDR
Dan Chiba
dan.chiba at oracle.com
Thu Dec 13 15:09:08 PST 2012
Hi Stephen, Yoshito,
For the discrepancies between 310 and CLDR, it would be ideal for us to
be able to resolve them one way or the other. Personally I look at them
as issues because they often cause problems in similar cases.
I don't think there is an issue with "I" in CLDR. Printing "I" from an
offset is an attempt to format a nonexistent field. It is the same as
attempting to format a time field when the value is date-only, or vice
versa, or anything similar.
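
To make the comparison concrete, here is a minimal sketch. It assumes the
java.time API as it later shipped in Java 8, which may differ from the 310
draft under discussion; the class and the regionId helper are hypothetical
names used only for illustration.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Optional;

final class MissingFieldSketch {
    // Hypothetical helper: yields the ID only when the zone is a region,
    // and nothing when it is a fixed offset (there is no region to print).
    static Optional<String> regionId(ZoneId zone) {
        return zone instanceof ZoneOffset ? Optional.empty() : Optional.of(zone.getId());
    }

    // Returns true when formatting a time field from a date-only value fails.
    static boolean timeFieldMissingFromDate() {
        try {
            DateTimeFormatter.ofPattern("HH:mm").format(LocalDate.of(2012, 12, 13));
            return false;
        } catch (DateTimeException e) {
            // The value has no hour or minute field to print.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(regionId(ZoneId.of("America/Los_Angeles")));
        System.out.println(regionId(ZoneId.of("-08:00")));
        System.out.println(timeFieldMissingFromDate());
    }
}
```

Whether a missing region ID should raise an error or print nothing is a
design choice; the sketch only shows that both cases are "field not
present" situations.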
Another possible difference is punctuation. LDML says "... certain
ASCII punctuation characters may become variable in the future (for
example, ":" being interpreted as the time separator and '/' as a date
separator, and replaced by respective locale-sensitive characters in
display)." I think locale-sensitive pattern letters for punctuation are
desirable; without them, formatting to the local user's expectation
could be difficult to achieve. Fractional seconds would be particularly
difficult, as predefined locale-sensitive patterns with fractional
seconds are hard to find.
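
As an illustration of the fractional-seconds case, the following sketch
assumes the java.time API as it later shipped in Java 8 (DecimalStyle and
DateTimeFormatterBuilder.appendFraction), which may not match the 310 draft
at the time of writing. The class and method names are hypothetical. The
decimal separator follows the locale, while the ":" in the pattern remains
a fixed literal.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DecimalStyle;
import java.time.temporal.ChronoField;
import java.util.Locale;

final class FractionSeparatorSketch {
    // Formats HH:mm:ss plus a three-digit fraction, taking the decimal
    // separator from the given locale's decimal style.
    static String format(LocalTime time, Locale locale) {
        DateTimeFormatter formatter = new DateTimeFormatterBuilder()
                .appendPattern("HH:mm:ss")
                // true = emit the decimal point defined by the DecimalStyle
                .appendFraction(ChronoField.NANO_OF_SECOND, 3, 3, true)
                .toFormatter(Locale.ROOT)
                .withDecimalStyle(DecimalStyle.of(locale));
        return formatter.format(time);
    }

    public static void main(String[] args) {
        LocalTime time = LocalTime.of(15, 9, 8, 250_000_000);
        System.out.println(format(time, Locale.US));
        System.out.println(format(time, Locale.GERMANY));
    }
}
```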
Yoshito, would you let us know if you would like us to file a formal
request for new pattern letters, please?
Regards,
-Dan
On 5 December 2012 14:42, scolebourne at joda.org wrote:
> On 5 December 2012 04:36, yoshito_umaoka at us.ibm.com wrote:
> > "The pattern string is also similar, but not identical, to that defined by
> > the Unicode Common Locale Data Repository (CLDR)."
>
> This is the relevant section of comments in the code:
>
> FIELD_MAP.put('G', ChronoField.ERA); // Java, CLDR (different to both for 1/2 chars)
> FIELD_MAP.put('y', ChronoField.YEAR); // CLDR
> // FIELD_MAP.put('y', ChronoField.YEAR_OF_ERA); // Java, CLDR // TODO redefine from above
> // FIELD_MAP.put('u', ChronoField.YEAR); // CLDR // TODO
> // FIELD_MAP.put('Y', ISODateTimeField.WEEK_BASED_YEAR); // Java7, CLDR (needs localized week number) // TODO
> // FIELD_MAP.put('Q', QuarterYearField.QUARTER_OF_YEAR); // CLDR (removed quarter from 310)
> // FIELD_MAP.put('q', QuarterYearField.QUARTER_OF_YEAR); // CLDR (needs standalone data) // TODO
> FIELD_MAP.put('M', ChronoField.MONTH_OF_YEAR); // Java, CLDR
> // FIELD_MAP.put('L', ChronoField.MONTH_OF_YEAR); // Java, CLDR (needs standalone data) // TODO
> // FIELD_MAP.put('w', ISODateTimeField.WEEK_OF_WEEK_BASED_YEAR); // Java, CLDR (needs localized week number) // TODO
> FIELD_MAP.put('D', ChronoField.DAY_OF_YEAR); // Java, CLDR
> FIELD_MAP.put('d', ChronoField.DAY_OF_MONTH); // Java, CLDR
> FIELD_MAP.put('F', ChronoField.ALIGNED_WEEK_OF_MONTH); // Java, CLDR
> FIELD_MAP.put('E', ChronoField.DAY_OF_WEEK); // Java, CLDR (different to both for 1/2 chars)
> // FIELD_MAP.put('e', ChronoField.DAY_OF_WEEK); // CLDR (needs localized week number) // TODO
> // FIELD_MAP.put('c', ChronoField.DAY_OF_WEEK); // CLDR (needs standalone data) // TODO
> FIELD_MAP.put('a', ChronoField.AMPM_OF_DAY); // Java, CLDR
> FIELD_MAP.put('H', ChronoField.HOUR_OF_DAY); // Java, CLDR
> FIELD_MAP.put('k', ChronoField.CLOCK_HOUR_OF_DAY); // Java, CLDR
> FIELD_MAP.put('K', ChronoField.HOUR_OF_AMPM); // Java, CLDR
> FIELD_MAP.put('h', ChronoField.CLOCK_HOUR_OF_AMPM); // Java, CLDR
> FIELD_MAP.put('m', ChronoField.MINUTE_OF_HOUR); // Java, CLDR
> FIELD_MAP.put('s', ChronoField.SECOND_OF_MINUTE); // Java, CLDR
> FIELD_MAP.put('S', ChronoField.NANO_OF_SECOND); // CLDR (Java uses milli-of-second number)
> FIELD_MAP.put('A', ChronoField.MILLI_OF_DAY); // CLDR
> FIELD_MAP.put('n', ChronoField.NANO_OF_SECOND); // 310
> FIELD_MAP.put('N', ChronoField.NANO_OF_DAY); // 310
> // reserved - z,Z,X,I,p
> // Java - X - compatible, but extended to 4 and 5 letters
> // Java - u - clashes with CLDR, go with CLDR (year-proleptic) here
> // CLDR - U - cycle year name, not supported by 310 yet
> // CLDR - l - deprecated
> // CLDR - W - week-of-month following CLDR rules
> // CLDR - j - not relevant
> // CLDR - g - modified-julian-day
> // CLDR - z - time-zone names // TODO properly
> // CLDR - Z - different approach here // TODO bring 310 in line with CLDR
> // CLDR - v,V - extended time-zone names
> // 310 - I - time-zone id
> // 310 - p - prefix for padding
>
> If you can decode my comments you'll find a variety of differences.
>
> > With my quick review, following pattern letters are not available in LDML
> > specification
> >
> >   n   nano-of-second   number/fraction   987654321
> >   N   nano-of-day      number/fraction   1234000000
> >   I   time-zone ID     zoneID            America/Los_Angeles
> >
> > I think all of these make sense. In longer term, I don't want to see
> > different pattern definition across similar implementations. If these are
> > final, I'm happy to work on CLDR side and propose these pattern letters
> > registered in the LDML specification.
>
> It would be good to see these in LDML. As of now, n and N are simple
> numbers, not fractions.
> (Previously, I used 'f' as a prefix turning the next letter into a fraction)
>
> However, "I" is interesting. In 310 we now have three underlying
> concepts: offsets from UTC/Greenwich, regional time-zone IDs, and
> general time-zone IDs (either an offset ID or a region ID). Most text
> formats would want to print both the offset and the time-zone ID, but
> they would not want to print the time-zone ID if it is an offset (as
> opposed to a region). Thus, "I" should probably be the subset of all
> time-zone IDs that represent regions. This works well in 310 where we
> have this distinction, but may be tricky in LDML.
>
>
> >   X   zone-offset 'Z' for zero   offset-X   Z; -0800; -08:00;
> >
> > This one is tricky. We recently added ISO 8601 style offset format in the
> > LDML specification with "ZZZZZ"
> > LDML "ZZZZZ" formats UTC to "Z" and non-UTC offsets using longer format
> > such as "-08:00" (with colon).
> >
> >   Z   zone-offset   offset-Z   +0000; -0800; -08:00;
> >
> > On the other hand, pattern letter "Z" in the LDML specification uses the
> > following definition-
> >
> > Z, ZZ, ZZZ -> RFC822 format
> > ZZZZ -> Localized GMT format, such as "GMT-08:00" "UTC-08:00"...
> >
> > I personally think CLDR may introduce pattern X, supporting both
> > long/short offset format, and deprecate "ZZZZZ" (or simply leave it as an
> > alias definition).
> >
> > For offset format, CLDR community is now seeking for shorter offset format
> > support. For example, "GMT-3" instead of "GMT-03:00".
>
> We don't have that right now.
>
> > Anyway, I'll consult with CLDR community members to find out if CLDR can
> > provide compatible definitions for these too.
>
> Keep us informed. We may have some ability to tweak these patterns for
> the next few months, but it will be increasingly difficult.
>
> Stephen
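
For reference, the X and Z offset letters discussed in the quoted message
can be compared with a small sketch. It assumes the behaviour of
java.time.format.DateTimeFormatter as later shipped in Java 8, which may
differ from both the 310 draft and the LDML text at the time of this
thread; the class and method names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class OffsetPatternSketch {
    // Formats only the offset part of a fixed sample date-time.
    static String format(String pattern, ZoneOffset offset) {
        OffsetDateTime sample = OffsetDateTime.of(2012, 12, 13, 15, 9, 8, 0, offset);
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(sample);
    }

    public static void main(String[] args) {
        ZoneOffset pacific = ZoneOffset.ofHours(-8);
        for (String pattern : new String[] {"X", "XX", "XXX", "Z", "ZZZZZ"}) {
            // Print each pattern for a non-zero offset and for UTC.
            System.out.println(pattern + " -> " + format(pattern, pacific)
                    + " / " + format(pattern, ZoneOffset.UTC));
        }
    }
}
```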