[threeten-dev] CLDR Islamic calendar types

yoshito_umaoka at us.ibm.com yoshito_umaoka at us.ibm.com
Mon Jan 28 14:25:49 PST 2013


> Ok, I was only aware of BCP47:
>  extension     = singleton 1*("-" (2*8alphanum))

Yes, this is the BCP 47 base syntax.

> Is it correct that the Unicode extension tightens this rule to 3 to 
> 8 letters (in RFC6067 section 2.1, first bullet)?

The complete syntax definition for the u (and t) extension is in section 3 
of the LDML specification:

http://www.unicode.org/reports/tr35/#Unicode_Language_and_Locale_Identifiers

The Unicode locale extension may consist of attributes and keywords.
Each keyword consists of a key and an optional type.

Because we want to support a well-formedness check without having a 
complete subtag repository, we decided to limit keys to always be 2-letter 
codes, and attribute and type subtags to be 3 to 8 characters. For example, 
the following language tag is well-formed (but not valid, because the 
subtags are not registered in the locale extension registry).

en-u-abc-xyz-aa-bbb-ccc-dd-ee-fff

The above is parsed into:

language: en

Unicode extension:
  attribute: abc
  attribute: xyz
  keyword: key=aa / value=bbb-ccc
  keyword: key=dd
  keyword: key=ee / value=fff
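
The length rule described above can be sketched as a purely syntactic 
check. This is only a simplified illustration of the rule as stated in this 
message (2-character keys, 3 to 8 character attribute and type subtags), 
not the full LDML grammar; the class name UExtensionSyntax and the method 
isWellFormed are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

public class UExtensionSyntax {
    // Assumption: subtags are ASCII alphanumerics, compared case-insensitively.
    private static final String ATTR_OR_TYPE = "-[a-z0-9]{3,8}";
    private static final String KEYWORD = "-[a-z0-9]{2}(?:" + ATTR_OR_TYPE + ")*";

    // "u" followed by attributes and then keywords, with at least one of either.
    private static final Pattern U_EXTENSION = Pattern.compile(
        "u(?:(?:" + ATTR_OR_TYPE + ")+(?:" + KEYWORD + ")*|(?:" + KEYWORD + ")+)",
        Pattern.CASE_INSENSITIVE);

    // Returns true if the extension (starting at the "u" singleton) matches
    // the length rule; it does not consult any subtag registry.
    static boolean isWellFormed(String extension) {
        return U_EXTENSION.matcher(extension).matches();
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("u-abc-xyz-aa-bbb-ccc-dd-ee-fff"));
        System.out.println(isWellFormed("u-abcdefghi"));
    }
}
```

Because the check relies only on subtag lengths, it needs no registry data, 
which is the point of the restriction.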

As you may know, the parser is implemented in the java.util.Locale class in 
JDK 7.
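
As a minimal sketch of how that breakdown can be inspected, the following 
uses the java.util.Locale methods added in JDK 7 (forLanguageTag, 
getUnicodeLocaleAttributes, getUnicodeLocaleKeys, getUnicodeLocaleType). 
The class name UnicodeExtensionDemo and the describe helper are 
hypothetical, and the output format simply mirrors the listing above.

```java
import java.util.Locale;
import java.util.TreeSet;

public class UnicodeExtensionDemo {
    // Lists the language, attributes, and keywords of a language tag.
    static String describe(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        StringBuilder sb = new StringBuilder();
        sb.append("language: ").append(locale.getLanguage()).append('\n');

        // Copy into a TreeSet so the listing order is deterministic.
        for (String attribute : new TreeSet<String>(locale.getUnicodeLocaleAttributes())) {
            sb.append("attribute: ").append(attribute).append('\n');
        }
        for (String key : new TreeSet<String>(locale.getUnicodeLocaleKeys())) {
            // A key defined without a type yields the empty string.
            String type = locale.getUnicodeLocaleType(key);
            sb.append("keyword: key=").append(key);
            if (!type.isEmpty()) {
                sb.append(" / value=").append(type);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe("en-u-abc-xyz-aa-bbb-ccc-dd-ee-fff"));
    }
}
```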

> Yes, I think this is helpful, although currently we only plan to 
> support "sa0" for this kind.

I'm wondering what the implications are of a sighting-based Islamic 
calendar, as used in Saudi Arabia, in the context of JSR-310.
There is no good way to calculate future dates with it. Of course, it 
might still be valid for past dates (as long as Java provides updates 
when each new month is officially determined in Saudi Arabia). How would 
you implement this as a chronology in the JSR-310 framework?

Honestly, I'm not sure it is worth distinguishing the sighting-based 
dates used in region A from the sighting-based dates used in region B. 
Neither is predictable by software. Isn't it better to just put them 
into one bucket, as a sighting-based/religious calendar?

> I think it is great if CLDR can define cv values in bcp47/*.xml.

Of course, when they are added into the registry, some explanation of the 
types should be provided.

> Would it be reasonable to have a longer description as well, so more
> information can be added? If this long description field can carry a
> reference to the full specification, no external registry would be
> necessary.

It's obviously impossible to spell out the complete algorithm there.
Of course, we try to eliminate ambiguity as much as possible in the short 
description.
For example, the registry does not try to explain what the Gregorian 
calendar is.

-Yoshito


