[threeten-dev] zidtext parser

Xueming Shen xueming.shen at oracle.com
Sun Jan 13 16:36:59 PST 2013


On 1/13/13 3:02 PM, Stephen Colebourne wrote:
> Reading through the whole of
> http://www.unicode.org/reports/tr35/#Time_Zone_Fallback (Parsing
> section) is a big job that I'm struggling to find time for.
>
> One thought is whether the metazone concept should be exposed to users
> in the future, but I don't think it affects the API now. A second
> thought is how does this work for new zone IDs added between JDK
> releases (as you are hard coding the metazone info - should metazones
> be in TZDB.jar?)

I don't see the need to expose the metazone info to users, at
least for now.
It's a combination of the alias/link info from the tzdb and the metazone
data from the cldr. I would expect we will/should get the metazone data
from the utility class that currently provides the zone name info (the jdk
cldr code already reads metaZones.xml, it just needs to read a little more).
It's hard coded for now simply because I can't get it from those utilities.
The alias/link info is trickier to handle. It's in the tzdb already, we just
need a way to access it via the provider.
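
To make the combination concrete, here is a minimal sketch of the
zid -> metazone -> zid lookup. The class name, the method name and the few
table entries are illustrative assumptions, not the actual ZoneName code;
the real tables would be generated from the tzdb Link entries and
metaZones.xml.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only; not the ZoneName class from the webrev.
final class MetaZoneLookupSketch {
    // alias id -> current id, as would be derived from tzdb "Link" lines
    private static final Map<String, String> ALIASES = new HashMap<>();
    // zone id -> metazone name, as would be derived from metaZones.xml
    private static final Map<String, String> ZID_TO_MZONE = new HashMap<>();
    // "metazone/region" -> preferred zone id; region "001" is the default
    private static final Map<String, String> MZONE_TO_ZID = new HashMap<>();

    static {
        // A handful of sample entries, for illustration only.
        ALIASES.put("US/Pacific", "America/Los_Angeles");
        ZID_TO_MZONE.put("America/Los_Angeles", "America_Pacific");
        ZID_TO_MZONE.put("America/Vancouver", "America_Pacific");
        MZONE_TO_ZID.put("America_Pacific/001", "America/Los_Angeles");
        MZONE_TO_ZID.put("America_Pacific/CA", "America/Vancouver");
    }

    static String toPreferredZid(String zid, Locale locale) {
        // Resolve an old name to its current name first.
        String canonical = ALIASES.getOrDefault(zid, zid);
        String mzone = ZID_TO_MZONE.get(canonical);
        if (mzone == null) {
            return canonical; // no metazone known, keep the id as is
        }
        // Prefer the zid defined for the parser's region, if there is one.
        String regional = MZONE_TO_ZID.get(mzone + "/" + locale.getCountry());
        if (regional != null) {
            return regional;
        }
        // Otherwise fall back to the default/001 zid of the metazone.
        String dflt = MZONE_TO_ZID.get(mzone + "/001");
        return dflt != null ? dflt : canonical;
    }
}
```

With these sample tables, "US/Pacific" parsed under a Canadian locale
resolves to America/Vancouver, and under any locale without its own
mapping it resolves to America/Los_Angeles.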

Another issue is getAvailableZoneId(). We are not using the "size" to
decide whether or not there are no ids installed. Does it have to return
a "modified" map? It just appears to be too expensive to invoke this
method every time we need to parse a zoneid.
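
One possible way around the cost, sketched under assumptions: take a
snapshot of the id set once and reuse it for every parse. The sketch uses
ZoneId.getAvailableZoneIds() from the java.time API as a stand-in for the
method discussed here, and the class and method names are made up for
illustration.

```java
import java.time.ZoneId;
import java.util.Set;

// Hypothetical sketch: cache the available zone ids instead of
// asking the provider again for every parse attempt.
final class ZoneIdCacheSketch {
    private static volatile Set<String> cached;

    static Set<String> zoneIds() {
        Set<String> ids = cached;
        if (ids == null) {
            // Assumption: one snapshot is good enough for the parser.
            ids = ZoneId.getAvailableZoneIds();
            cached = ids;
        }
        return ids;
    }

    static boolean isKnownZoneId(String text) {
        return zoneIds().contains(text);
    }
}
```

The trade-off is that ids registered after the first call are not seen
until the snapshot is refreshed, so some invalidation rule would still
be needed.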

>
> A brief read of the CLDR document suggests that we should accept "UT"
> as well as "UTC" and "GMT" as prefixes for the offset-based IDs. We
> should also accept a number without + as a positive number.
>
> Not sure how you will specify this, or whether it can be left open
> enough to expand in the future.
This can just be added if needed. I don't think it's an issue if a future
release accepts forms that are currently being rejected.
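
For reference, a rough sketch of what such lenient handling could look
like: accept "UT", "UTC" or "GMT" as the prefix and treat an unsigned
number as positive. The class and method names are hypothetical, and this
is not what the current parser does; it only leans on ZoneOffset.of()
from the java.time API for the numeric part.

```java
import java.time.ZoneOffset;

// Hypothetical sketch of lenient offset-based id parsing.
final class OffsetIdSketch {
    // "UTC" is listed before "UT" so the longer prefix is matched first.
    private static final String[] PREFIXES = {"UTC", "GMT", "UT"};

    static ZoneOffset parseOffsetId(String text) {
        for (String prefix : PREFIXES) {
            if (text.startsWith(prefix)) {
                String rest = text.substring(prefix.length());
                if (rest.isEmpty()) {
                    return ZoneOffset.UTC; // bare prefix means offset zero
                }
                char first = rest.charAt(0);
                if (first != '+' && first != '-') {
                    rest = "+" + rest; // no sign: treat as positive
                }
                // Throws DateTimeException if the rest is not a valid offset.
                return ZoneOffset.of(rest);
            }
        }
        throw new IllegalArgumentException("Not an offset-based id: " + text);
    }
}
```

Anything that is rejected today would still be rejected by the same
exception path, so forms can be added to the accepted set later without
breaking existing input.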

-Sherman

> Stephen
>
>
>
> On 11 January 2013 02:44, Xueming Shen <xueming.shen at oracle.com> wrote:
>> Hi,
>>
>> One helper class ZoneName, which is based on the metazone data
>> info from metaZones.xml [1] and the zone alias info from the Link
>> entries of all the tzdb data files, is added to help implement the
>> parsing logic suggested by cldr. The reason we have to have
>> the zone alias info is that it appears metaZones.xml uses both old
>> name and new name in different places.
>>
>> For each zid/zname candidate the implementation now looks up
>> the metazone table to see if it has a metazone specified for it, if
>> yes, it tries to see if there is a preferred zid defined for a particular
>> locale/region, if yes and if this locale/region matches parser's locale,
>> that preferred zid is used, otherwise, the default/001 zid for the meta
>> zone is returned.
>>
>> The webrev is at
>>
>> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser
>>
>> The zid->metazone->zid lookup tables are hard coded now in
>> ZoneName. Those mapping tables are generated from the link[2]
>> (which is a result of grep "Link" of all tzdb files) and metaZones.xml[1]
>> via the hacky tool MetaZone.java[3]. If this approach is good, I
>> would expect we will get this info from Naoto's TimeZoneUtilities
>> "someday" (I think we are parsing metaZones.xml in the cldr tool already,
>> probably for those generic names? we just need to go a little further)
>>
>> The test currently does not fail if the parsed result is not the
>> "expected" one (round trip). The test simply prints out those diffs
>> for manual check. If you take a close look at the output "result" [4],
>> which shows "zid", "input text", "parsed result" and "expected", it
>> appears all the parsed results are correct/reasonable. I guess the
>> reason those results are not the "expected" ones is mainly that those
>> zids are not in metaZones.xml.
>>
>> The result[4] shows the parsing works reasonably well for "full"
>> style, but we have lots of "missing/ambiguous" results for those
>> "short" style names. Given the nature of those short names and
>> the limited info (the "locale"), it may be reasonable to not support
>> those ambiguous short names in parsing? An alternative is to
>> "specify" the mapping table in the spec, but it's not going to make
>> everyone happy either.
>>
>> Opinion?
>>
>> -Sherman
>>
>> [1] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/metaZones
>> [2] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/link
>> [3]
>> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/MetaZone.java
>> [4] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/result
>>
>>
>> On 01/04/2013 09:07 AM, Stephen Colebourne wrote:
>>> For this release or JDK 9?
>>> We need to ensure that we don't do anything that prevents implementing
>>> the full CLDR strategy if we are not doing it now.
>>> Stephen
>>>
>>> On 4 January 2013 17:03, Xueming Shen<xueming.shen at oracle.com>  wrote:
>>>> On 1/4/13 2:32 AM, Stephen Colebourne wrote:
>>>>> Really, we should implement the rules described in CLDR, as they seem
>>>>> to have thought about it:
>>>>> http://www.unicode.org/reports/tr35/#Time_Zone_Fallback
>>>>
>>>> we need to pull in more cldr data...
>>>>
>>>>
>>>>> Stephen
>>>>


