[threeten-dev] zidtext parser
Xueming Shen
xueming.shen at oracle.com
Sun Jan 13 15:00:42 PST 2013
Opinions, comments? Should we check this in for M6?
The webrev has recently been updated to something more reasonable.
http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser
-Sherman
On 1/10/13 6:44 PM, Xueming Shen wrote:
> Hi,
>
> One helper class, ZoneName, is added to help implement the parsing
> logic suggested by cldr. It is based on the metazone data from
> metaZones.xml [1] and the zone alias info from the Link entries of
> all the tzdb data files. The reason we have to have the zone alias
> info is that metaZones.xml appears to use both the old name and the
> new name in different places.
>
> For each zid/zname candidate, the implementation now looks up
> the metazone table to see whether a metazone is specified for it. If
> so, it checks whether a preferred zid is defined for a particular
> locale/region. If there is one and that locale/region matches the
> parser's locale, that preferred zid is used; otherwise, the
> default/001 zid for the metazone is returned.
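
The lookup described above can be sketched roughly as follows. This is only an illustration, not the code in the webrev: the class name ZoneLookupSketch, the method toZid, and the three small tables are hypothetical stand-ins for the generated tables in ZoneName, and the behavior for a zid with no metazone (return the candidate unchanged) is an assumption.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a zid -> metazone -> zid lookup; not the webrev's ZoneName.
final class ZoneLookupSketch {
    // Illustrative sample entries only; the real tables are generated from tzdb and CLDR data.
    private static final Map<String, String> ALIASES =
            Map.of("US/Pacific", "America/Los_Angeles");
    private static final Map<String, String> ZID_TO_META = Map.of(
            "America/Los_Angeles", "America_Pacific",
            "America/Vancouver", "America_Pacific");
    private static final Map<String, Map<String, String>> META_TO_ZID = Map.of(
            "America_Pacific",
            Map.of("001", "America/Los_Angeles", "CA", "America/Vancouver"));

    static String toZid(String candidate, Locale locale) {
        // Resolve an old (Link) name to its current name first.
        String zid = ALIASES.getOrDefault(candidate, candidate);
        String meta = ZID_TO_META.get(zid);
        if (meta == null) {
            return zid; // assumption: no metazone means the candidate is kept as-is
        }
        Map<String, String> byRegion = META_TO_ZID.getOrDefault(meta, Map.of());
        // Prefer the zid defined for the parser's region, else the default "001" entry.
        String preferred = byRegion.get(locale.getCountry());
        if (preferred == null) {
            preferred = byRegion.get("001");
        }
        return preferred != null ? preferred : zid;
    }

    public static void main(String[] args) {
        System.out.println(toZid("US/Pacific", Locale.US));
        System.out.println(toZid("America/Los_Angeles", Locale.CANADA));
    }
}
```

With these sample tables, a US parser locale falls through to the 001 default, while a Canadian locale picks the CA-specific zid for the same metazone.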
>
> The webrev is at
>
> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser
>
> The zid->metazone->zid lookup tables are hard-coded in ZoneName
> for now. Those mapping tables are generated from the link[2] file
> (the result of grepping "Link" in all the tzdb files) and
> metaZones.xml[1] via the hacky tool MetaZone.java[3]. If this
> approach is good, I would expect us to get this info from Naoto's
> TimeZoneUtilities "someday" (I think we are already parsing
> metaZones.xml in the cldr tool, probably for the generic names? We
> just need to go a little further).
>
> The test currently does not fail if the parsed result is not the
> "expected" one (round trip). It simply prints out the diffs for
> manual checking. If you take a close look at the output "result" [4],
> which shows "zid", "input text", "parsed result" and "expected", it
> appears that all the parsed results are correct/reasonable. I guess
> the main reason those results are not the "expected" ones is that
> those zids are not in metaZones.xml.
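
A round-trip check of the kind described above might look like the sketch below. It is not the test in the webrev: the class ZoneTextRoundTrip and the method mismatches are hypothetical, and it uses the java.time API as it shipped in JDK 8, which may differ from the threeten code at the time of this thread. Like the test described, it only collects the differences rather than failing.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.TextStyle;
import java.time.temporal.TemporalQueries;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical round-trip check: format each zid as zone text, parse it back, report diffs.
final class ZoneTextRoundTrip {
    static List<String> mismatches(List<String> zids, TextStyle style, Locale locale) {
        DateTimeFormatter fmt = new DateTimeFormatterBuilder()
                .appendZoneText(style)
                .toFormatter(locale);
        // Arbitrary fixed instant so the printed name (standard vs. daylight) is stable.
        ZonedDateTime sample = ZonedDateTime.of(2013, 1, 13, 15, 0, 0, 0, ZoneId.of("UTC"));
        List<String> diffs = new ArrayList<>();
        for (String zid : zids) {
            String text = fmt.format(sample.withZoneSameInstant(ZoneId.of(zid)));
            String parsed;
            try {
                parsed = fmt.parse(text, TemporalQueries.zoneId()).getId();
            } catch (DateTimeParseException e) {
                parsed = "<unparsed>"; // text could not be mapped back to any zone
            }
            if (!parsed.equals(zid)) {
                // Columns: zid, input text, parsed result, expected
                diffs.add(zid + " | " + text + " | " + parsed + " | " + zid);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        List<String> zids = List.of("America/Los_Angeles", "America/Vancouver", "Europe/Paris");
        mismatches(zids, TextStyle.FULL, Locale.US).forEach(System.out::println);
        mismatches(zids, TextStyle.SHORT, Locale.US).forEach(System.out::println);
    }
}
```

Which rows are printed depends on the locale data of the JDK in use, so the output is meant for manual inspection.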
>
> The result[4] shows that parsing works reasonably well for the "full"
> style, but we get lots of "missing/ambiguous" results for the
> "short" style names. Given the nature of those short names and
> the limited info (the "locale"), it may be reasonable to not support
> those ambiguous short names in parsing? An alternative is to
> "specify" the mapping table in the spec, but that's not going to make
> everyone happy either.
>
> Opinion?
>
> -Sherman
>
> [1]
> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/metaZones
> [2] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/link
> [3]
> http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/MetaZone.java
> [4] http://cr.openjdk.java.net/~sherman/jdk8_threeten/ztext_parser/result
>
> On 01/04/2013 09:07 AM, Stephen Colebourne wrote:
>> For this release or JDK 9?
>> We need to ensure that we don't do anything that prevents implementing
>> the full CLDR strategy if we are not doing it now.
>> Stephen
>>
>> On 4 January 2013 17:03, Xueming Shen<xueming.shen at oracle.com> wrote:
>>> On 1/4/13 2:32 AM, Stephen Colebourne wrote:
>>>> Really, we should implement the rules described in CLDR, as they seem
>>>> to have thought about it:
>>>> http://www.unicode.org/reports/tr35/#Time_Zone_Fallback
>>>
>>> we need to pull in more cldr data...
>>>
>>>
>>>> Stephen
>>>
>