[threeten-dev] Possible addition of pattern letters in CLDR

Dan Chiba dan.chiba at oracle.com
Thu Dec 13 15:09:08 PST 2012


Hi Stephen, Yoshito,

For the discrepancies between 310 and CLDR, it would be ideal for us to 
be able to resolve them one way or the other. Personally I look at them 
as issues because they often cause problems in similar cases.

I don't think there is an issue with "I" in CLDR. Printing "I" from an 
offset is an attempt to format a nonexistent field. It is the same as 
attempting to format a time field when the value is date-only, or vice 
versa, or anything similar.
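
To illustrate the "nonexistent field" case, here is a minimal sketch. It
assumes the java.time API as it later shipped in Java 8, not the 310 draft
under discussion, and the class and method names are only for illustration.
Formatting a date-only value with a time pattern is rejected, not printed
as something arbitrary:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.UnsupportedTemporalTypeException;

public class MissingFieldDemo {

    /** Returns true if a time-only pattern refuses to format a date-only value. */
    static boolean timePatternRejectsDateOnly() {
        DateTimeFormatter timeOnly = DateTimeFormatter.ofPattern("HH:mm");
        try {
            // LocalDate has no hour-of-day field, so this cannot succeed.
            timeOnly.format(LocalDate.of(2012, 12, 13));
            return false;
        } catch (UnsupportedTemporalTypeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + timePatternRejectsDateOnly());
    }
}
```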

Another possible difference is punctuation. LDML says "... certain 
ASCII punctuation characters may become variable in the future (for 
example, ":" being interpreted as the time separator and '/' as a date 
separator, and replaced by respective locale-sensitive characters in 
display)." I think locale-sensitive pattern letters for punctuation are 
desirable; without them, formatting to the local user's expectations 
could be difficult. Fractional seconds would be particularly 
difficult, as predefined locale-sensitive patterns with fractional 
seconds are hard to find.
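
As a rough sketch of the fractional seconds problem, the following uses the
java.time API as it later shipped in Java 8 (an assumption; the draft API
discussed in this thread may differ). The decimal separator can be made
locale-sensitive through DecimalStyle, but only by building the formatter in
code; the ":" in the pattern stays a fixed literal. The class and method
names are hypothetical.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DecimalStyle;
import java.time.temporal.ChronoField;
import java.util.Locale;

public class LocalizedFractionDemo {

    /** Formats a time with millisecond precision and a locale-specific decimal separator. */
    static String format(LocalTime time, Locale locale) {
        DateTimeFormatter formatter = new DateTimeFormatterBuilder()
                .appendPattern("HH:mm:ss")   // ':' is a literal here, not localized
                .appendFraction(ChronoField.NANO_OF_SECOND, 3, 3, true)
                .toFormatter(locale)
                .withDecimalStyle(DecimalStyle.of(locale));
        return formatter.format(time);
    }

    public static void main(String[] args) {
        LocalTime time = LocalTime.of(15, 9, 8, 123_000_000);
        System.out.println(format(time, Locale.US));      // 15:09:08.123
        System.out.println(format(time, Locale.GERMANY)); // 15:09:08,123
    }
}
```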

Yoshito, would you let us know if you would like us to file a formal 
request for new pattern letters, please?

Regards,
-Dan

On 5 December 2012 14:42, <scolebourne at joda.org> wrote:

> On 5 December 2012 04:36, <yoshito_umaoka at us.ibm.com> wrote:
> > "The pattern string is also similar, but not identical, to that defined by
> > the Unicode Common Locale Data Repository (CLDR)."
>
> This is the relevant section of comments in the code:
>
>          FIELD_MAP.put('G', ChronoField.ERA);                       // Java, CLDR (different to both for 1/2 chars)
>          FIELD_MAP.put('y', ChronoField.YEAR);                      // CLDR
>          // FIELD_MAP.put('y', ChronoField.YEAR_OF_ERA);            // Java, CLDR  // TODO redefine from above
>          // FIELD_MAP.put('u', ChronoField.YEAR);                   // CLDR  // TODO
>          // FIELD_MAP.put('Y', ISODateTimeField.WEEK_BASED_YEAR);   // Java7, CLDR (needs localized week number)  // TODO
>          // FIELD_MAP.put('Q', QuarterYearField.QUARTER_OF_YEAR);   // CLDR (removed quarter from 310)
>          // FIELD_MAP.put('q', QuarterYearField.QUARTER_OF_YEAR);   // CLDR (needs standalone data)  // TODO
>          FIELD_MAP.put('M', ChronoField.MONTH_OF_YEAR);             // Java, CLDR
>          // FIELD_MAP.put('L', ChronoField.MONTH_OF_YEAR);          // Java, CLDR (needs standalone data)  // TODO
>          // FIELD_MAP.put('w', ISODateTimeField.WEEK_OF_WEEK_BASED_YEAR);  // Java, CLDR (needs localized week number)  // TODO
>          FIELD_MAP.put('D', ChronoField.DAY_OF_YEAR);               // Java, CLDR
>          FIELD_MAP.put('d', ChronoField.DAY_OF_MONTH);              // Java, CLDR
>          FIELD_MAP.put('F', ChronoField.ALIGNED_WEEK_OF_MONTH);     // Java, CLDR
>          FIELD_MAP.put('E', ChronoField.DAY_OF_WEEK);               // Java, CLDR (different to both for 1/2 chars)
>          // FIELD_MAP.put('e', ChronoField.DAY_OF_WEEK);            // CLDR (needs localized week number)  // TODO
>          // FIELD_MAP.put('c', ChronoField.DAY_OF_WEEK);            // CLDR (needs standalone data)  // TODO
>          FIELD_MAP.put('a', ChronoField.AMPM_OF_DAY);               // Java, CLDR
>          FIELD_MAP.put('H', ChronoField.HOUR_OF_DAY);               // Java, CLDR
>          FIELD_MAP.put('k', ChronoField.CLOCK_HOUR_OF_DAY);         // Java, CLDR
>          FIELD_MAP.put('K', ChronoField.HOUR_OF_AMPM);              // Java, CLDR
>          FIELD_MAP.put('h', ChronoField.CLOCK_HOUR_OF_AMPM);        // Java, CLDR
>          FIELD_MAP.put('m', ChronoField.MINUTE_OF_HOUR);            // Java, CLDR
>          FIELD_MAP.put('s', ChronoField.SECOND_OF_MINUTE);          // Java, CLDR
>          FIELD_MAP.put('S', ChronoField.NANO_OF_SECOND);            // CLDR (Java uses milli-of-second number)
>          FIELD_MAP.put('A', ChronoField.MILLI_OF_DAY);              // CLDR
>          FIELD_MAP.put('n', ChronoField.NANO_OF_SECOND);            // 310
>          FIELD_MAP.put('N', ChronoField.NANO_OF_DAY);               // 310
>          // reserved - z,Z,X,I,p
>          // Java - X - compatible, but extended to 4 and 5 letters
>          // Java - u - clashes with CLDR, go with CLDR (year-proleptic) here
>          // CLDR - U - cycle year name, not supported by 310 yet
>          // CLDR - l - deprecated
>          // CLDR - W - week-of-month following CLDR rules
>          // CLDR - j - not relevant
>          // CLDR - g - modified-julian-day
>          // CLDR - z - time-zone names  // TODO properly
>          // CLDR - Z - different approach here  // TODO bring 310 in line with CLDR
>          // CLDR - v,V - extended time-zone names
>          //  310 - I - time-zone id
>          //  310 - p - prefix for padding
>
> If you can decode my comments you'll find a variety of differences.
>
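
One of the differences in the table above is 'y' versus 'u'. The sketch
below is not the draft code quoted here; it assumes the java.time API as it
later shipped in Java 8, where 'u' is the proleptic year and 'y' is the
year-of-era. The class and method names are hypothetical. The two letters
only diverge for years before 1 CE:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class YearLettersDemo {

    /** Formats the date with the given pattern, using a fixed locale for repeatability. */
    static String format(String pattern, LocalDate date) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH).format(date);
    }

    public static void main(String[] args) {
        // Proleptic year 0 is year 1 of the BCE era in the ISO calendar system.
        LocalDate date = LocalDate.of(0, 1, 1);
        System.out.println(format("uuuu", date)); // 0000 (proleptic year)
        System.out.println(format("yyyy", date)); // 0001 (year-of-era)
    }
}
```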
> > In my quick review, the following pattern letters are not available in
> > the LDML specification:
> >
> >     n       nano-of-second              number/fraction   987654321
> >     N       nano-of-day                 number/fraction   1234000000
> >     I       time-zone ID                zoneID            America/Los_Angeles
> >
> > I think all of these make sense. In the longer term, I don't want to see
> > different pattern definitions across similar implementations. If these are
> > final, I'm happy to work on the CLDR side and propose that these pattern
> > letters be registered in the LDML specification.
>
> It would be good to see these in LDML. As of now, n and N are simple
> numbers, not fractions.
> (Previously, I used 'f' as a prefix turning the next letter into a fraction)
>
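
To make the "simple numbers, not fractions" distinction concrete, here is a
small sketch. It assumes the pattern letters as they later shipped in Java 8
(S = fraction-of-second, n = nano-of-second, N = nano-of-day, A =
milli-of-day), which may not match the draft at the time of this thread. The
class and method names are hypothetical.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class SubSecondLettersDemo {

    /** Formats the time with a single pattern. */
    static String format(String pattern, LocalTime time) {
        return DateTimeFormatter.ofPattern(pattern).format(time);
    }

    public static void main(String[] args) {
        // 1.5 seconds after midnight.
        LocalTime time = LocalTime.of(0, 0, 1, 500_000_000);
        System.out.println(format("SSS", time)); // 500        (fraction, 3 digits)
        System.out.println(format("n", time));   // 500000000  (nano-of-second as a number)
        System.out.println(format("N", time));   // 1500000000 (nano-of-day as a number)
        System.out.println(format("A", time));   // 1500       (milli-of-day as a number)
    }
}
```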
> However, "I" is interesting. In 310 we now have three underlying
> concepts: offsets from UTC/Greenwich, regional time-zone IDs, and
> general time-zone IDs (either an offset ID or a region ID). Most text
> formats would want to print both the offset and the time-zone ID, but
> they would not want to print the time-zone ID if it is an offset (as
> opposed to a region). Thus, "I" should probably be the subset of all
> time-zone IDs that represent regions. This works well in 310 where we
> have this distinction, but may be tricky in LDML.
>
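
The behaviour described above (always print the offset, print the zone ID
only when it is a region) can be sketched as follows. This assumes the
java.time types as they later shipped in Java 8, where a fixed-offset zone
is a ZoneOffset; the class and method names are hypothetical and this is
not the 310 implementation of "I".

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class RegionOnlyZoneDemo {

    /** Prints the offset date-time, adding "[zone-id]" only for region-based zones. */
    static String format(ZonedDateTime zdt) {
        String base = DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(zdt);
        ZoneId zone = zdt.getZone();
        // A ZoneOffset carries no information beyond the offset already printed.
        return (zone instanceof ZoneOffset) ? base : base + "[" + zone.getId() + "]";
    }

    public static void main(String[] args) {
        ZonedDateTime region = ZonedDateTime.of(
                2012, 12, 13, 15, 9, 8, 0, ZoneId.of("America/Los_Angeles"));
        ZonedDateTime fixed = ZonedDateTime.of(
                2012, 12, 13, 15, 9, 8, 0, ZoneOffset.ofHours(-8));
        System.out.println(format(region)); // 2012-12-13T15:09:08-08:00[America/Los_Angeles]
        System.out.println(format(fixed));  // 2012-12-13T15:09:08-08:00
    }
}
```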
>
> >     X       zone-offset 'Z' for zero    offset-X          Z; -0800; -08:00;
> >
> > This one is tricky. We recently added an ISO 8601 style offset format to
> > the LDML specification with "ZZZZZ".
> > LDML "ZZZZZ" formats UTC as "Z" and non-UTC offsets using the longer
> > format such as "-08:00" (with colon).
> >
> >     Z       zone-offset                 offset-Z          +0000; -0800; -08:00;
> >
> > On the other hand, pattern letter "Z" in the LDML specification uses the
> > following definition:
> >
> > Z, ZZ, ZZZ -> RFC822 format
> > ZZZZ -> Localized GMT format, such as "GMT-08:00" "UTC-08:00"...
> >
> > I personally think CLDR may introduce pattern X, supporting both
> > long/short offset format, and deprecate "ZZZZZ" (or simply leave it as an
> > alias definition).
> >
> > For offset format, the CLDR community is now seeking shorter offset
> > format support. For example, "GMT-3" instead of "GMT-03:00".
>
> We don't have that right now.
>
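
For comparison, here is a short sketch of how the X and Z letters behave in
the java.time API as it later shipped in Java 8. That is an assumption for
illustration; the 310 draft at the time of this thread defined Z
differently. The class and method names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class OffsetLettersDemo {

    /** Formats only the offset part of the value with the given pattern. */
    static String format(String pattern, OffsetDateTime value) {
        return DateTimeFormatter.ofPattern(pattern).format(value);
    }

    public static void main(String[] args) {
        OffsetDateTime pst = OffsetDateTime.of(2012, 12, 13, 15, 9, 8, 0, ZoneOffset.ofHours(-8));
        OffsetDateTime utc = OffsetDateTime.of(2012, 12, 13, 23, 9, 8, 0, ZoneOffset.UTC);

        System.out.println(format("X", pst));     // -08
        System.out.println(format("XX", pst));    // -0800
        System.out.println(format("XXX", pst));   // -08:00
        System.out.println(format("Z", pst));     // -0800
        System.out.println(format("ZZZZZ", pst)); // -08:00

        System.out.println(format("X", utc));     // Z
        System.out.println(format("Z", utc));     // +0000
        System.out.println(format("ZZZZZ", utc)); // Z
    }
}
```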
> > Anyway, I'll consult with CLDR community members to find out if CLDR can
> > provide compatible definitions for these too.
>
> Keep us informed. We may have some ability to tweak these patterns for
> the next few months, but it will be increasingly difficult.
>
> Stephen

