[threeten-dev] zidtext parser
Stephen Colebourne
scolebourne at joda.org
Sun Jan 13 15:02:29 PST 2013
Reading through the whole of
http://www.unicode.org/reports/tr35/#Time_Zone_Fallback (Parsing
section) is a big job that I'm struggling to find time for.
One thought is whether the metazone concept should be exposed to users
in the future, but I don't think it affects the API now. A second
thought is how this works for new zone IDs added between JDK releases,
since you are hard coding the metazone info. Should the metazones be
in TZDB.jar?
A brief read of the CLDR document suggests that we should accept "UT"
as well as "UTC" and "GMT" as prefixes for the offset-based IDs. We
should also accept a number without + as a positive number.
Not sure how you will specify this, or whether it can be left open
enough to expand in the future.
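
To make the prefix and sign handling above concrete, here is a rough
sketch, not a proposal for the actual parser. It assumes the
java.time.ZoneOffset class and its of(String) factory, which needs an
explicit sign. OffsetIdSketch and parseOffsetId are hypothetical names
used only for illustration.

```java
import java.time.ZoneOffset;

public class OffsetIdSketch {
    // Prefixes are tried longest first, so "UTC" is not read as "UT" followed by "C".
    private static final String[] PREFIXES = {"UTC", "GMT", "UT"};

    // Hypothetical helper: parses IDs such as "UT+01:00", "GMT5" or "UTC".
    static ZoneOffset parseOffsetId(String text) {
        String rest = text;
        boolean prefixed = false;
        for (String prefix : PREFIXES) {
            if (text.startsWith(prefix)) {
                rest = text.substring(prefix.length());
                prefixed = true;
                break;
            }
        }
        // A bare prefix with no offset part means an offset of zero.
        if (prefixed && rest.isEmpty()) {
            return ZoneOffset.UTC;
        }
        // A number without a sign is treated as positive.
        if (!rest.isEmpty() && Character.isDigit(rest.charAt(0))) {
            rest = "+" + rest;
        }
        // ZoneOffset.of throws DateTimeException for anything it cannot read.
        return ZoneOffset.of(rest);
    }
}
```

Keeping the prefixes in a single array is one way to leave room for
more of them later without touching the rest of the logic.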
Stephen
On 11 January 2013 02:44, Xueming Shen <xueming.shen at oracle.com> wrote:
> Hi,
>
> One helper class, ZoneName, is added to help implement the parsing
> logic suggested by cldr. It is based on the metazone data from
> metaZone.xml [1] and the zone alias info from the Link entries of
> all the tzdb data files. The reason we need the zone alias info is
> that metaZone.xml appears to use both old and new names in
> different places.
>
> For each zid/zname candidate, the implementation now looks up the
> metazone table to see whether a metazone is specified for it. If
> so, it checks whether a preferred zid is defined for a particular
> locale/region. If there is one and that locale/region matches the
> parser's locale, the preferred zid is used; otherwise the
> default/001 zid for the metazone is returned.
>
> The webrev is at
>
> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser
>
> The zid->metazone->zid lookup tables are hard coded in ZoneName
> for now. The mapping tables are generated from the link file [2]
> (the result of a grep for "Link" over all tzdb files) and
> metaZone.xml [1] via the hacky tool MetaZone.java [3]. If this
> approach is good, I would expect us to get this info from Naoto's
> TimeZoneUtilities "someday" (I think we are already parsing
> metaZones.xml in the cldr tool, probably for the generic names? We
> just need to go a little further).
>
> The test currently does not fail if the parsed result is not the
> "expected" one (round trip). It simply prints out the differences
> for manual checking. If you take a close look at the output
> "result" [4], which shows "zid", "input text", "parsed result" and
> "expected", all the parsed results appear correct/reasonable. I
> guess the main reason those results are not the "expected" ones is
> that those zids are not in metaZone.xml.
>
> The result [4] shows that parsing works reasonably well for the
> "full" style, but we have lots of "missing/ambiguous" results for
> the "short" style names. Given the nature of those short names and
> the limited info (the "locale"), it may be reasonable not to
> support those ambiguous short names in parsing? An alternative is
> to "specify" the mapping table in the spec, but that is not going
> to make everyone happy either.
>
> Opinion?
>
> -Sherman
>
> [1] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/metaZones
> [2] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/link
> [3]
> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/MetaZone.java
> [4] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/result
>
>
> On 01/04/2013 09:07 AM, Stephen Colebourne wrote:
>>
>> For this release or JDK 9?
>> We need to ensure that we don't do anything that prevents implementing
>> the full CLDR strategy if we are not doing it now.
>> Stephen
>>
>> On 4 January 2013 17:03, Xueming Shen<xueming.shen at oracle.com> wrote:
>>>
>>> On 1/4/13 2:32 AM, Stephen Colebourne wrote:
>>>>
>>>> Really, we should implement the rules described in CLDR, as they seem
>>>> to have thought about it:
>>>> http://www.unicode.org/reports/tr35/#Time_Zone_Fallback
>>>
>>>
>>> we need to pull in more cldr data...
>>>
>>>
>>>> Stephen
>>>
>>>
>