<i18n dev> <i18n> Performance patch for DateFormatSymbols.getZoneIndex(String)

Deven You youdwei at linux.vnet.ibm.com
Thu Apr 19 19:56:32 PDT 2012


Yoshito,

It is good to me for your suggestion. So Masayoshi, what is your 
opinion,  is cache the last zone index better than cache recent zone ID 
and zone indexes, do you have any concern about using last zone index?

[1] The updated version of this patch 
http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ 
<http://cr.openjdk.java.net/%7Eyoudwei/DateFormatSymbol_perf/webrev.02/>

Thanks a lot!

On 04/11/2012 10:35 PM, Yoshito Umaoka wrote:
> Deven,
>
> If last time zone ID is cached along with the previously resolved 
> index, and getZoneIndex does not check the contents of array, it might 
> not work when zone strings are updated (setZoneStrings). So, to make 
> this change work properly, you have to reset lastZoneID/zoneIndex in 
> setZoneStrings(String[][]).
>
> In my opinion, we just need to keep last zone index. And it should be 
> 'transient'.
>
> For every getZoneIndex(String) call, input ID is compared to 
> zoneStrings[lastZoneIndex][0] - and if they matches, simply return 
> lastIndex. Otherwise, do the linear search and update lastZoneIndex- 
> like below.
>
> private transient int lastZoneIndex = 0; // with initial value to be 
> 0, you do not need to check lastZoneIndex >= 0 every time
>
> final int getZoneIndex(String ID) {
>   // fast path
>
>   // note: if you want to get rid of lastZoneIndex < zoneStrings.length,
>   // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or any 
> valid index.
>
>   if (lastZoneIndex < zoneStrings.length && 
> ID.equals(zoneStrings[lastZoneIndex ][0]) {
>     return lastIndex;
>   }
>
>   // slow path
>   for (int index = 0; index < zoneStrings.length; index++) {
>     if (ID.equals(zoneStrings[index][0]) {
>       lastZoneIndex = index;
>       return index;
>     }
>   }
> }
>
> -Yoshito
>
> On 4/11/2012 3:02 AM, Deven You wrote:
>> I think Yoshito's suggestion make sense, since the getZoneIndex is an 
>> internal method, if there is no manual setting to change the 
>> timezone, The timezone ID won't be changed for one instance of 
>> SimpleDateFormt.
>>
>> I updated the patch webrev[1] for this suggestion and test the 
>> GetZoneIndexTest.java, the performance is almost the same with webrev.00.
>>
>> Please review the updated patch[1].
>>
>> Thanks a lot!
>>
>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 
>> <http://cr.openjdk.java.net/%7Elittlee/OJDK-548/webrev.01>
>>
>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote:
>>> I'm wondering if caching index matched last time (per 
>>> DateFormatSymbols instance) is sufficient, instead of keeping 
>>> multiple results in a hash map.
>>> I guess majority of use cases repeatedly refer the index of the zone 
>>> used by SimpleDateFormat.
>>>
>>> -Yoshito
>>>
>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote:
>>>> Hi Deven,
>>>>
>>>> Here are my comments on the proposed changes.
>>>>
>>>> - zoneIndexCache should be an instance field because WeakHashMap 
>>>> isn't thread-safe and the order of IDs in zoneStrings differs in 
>>>> each DateFormatSymbols.
>>>>
>>>> - equals shouldn't be replaced with equalsIgnoreCase because time 
>>>> zone IDs are (supposed to be) case-sensitive.
>>>>
>>>> - Utilize auto-boxing/unboxing.
>>>>
>>>> Thanks,
>>>> Masayoshi
>>>>
>>>> On 4/9/2012 12:10 PM, Deven You wrote:
>>>>> Hi i18n-devs,
>>>>>
>>>>> The getZoneIndex() method is expensive and it's performance 
>>>>> depends on which timezone you're in. This is because the code 
>>>>> steps through a list of timezones until it finds the current one. 
>>>>> This is happening over and over again. An idea would be to either 
>>>>> cache it or rewrite the way we store time zone ids, such as a 
>>>>> hashmap instead of an array.
>>>>> This patch[1] is for the cache option.
>>>>>
>>>>> Applications which  format/parse dates using SimpleDateFormat 
>>>>> repeatedly may obtain benefits from this patch, especially when 
>>>>> run in a timezone far down the zoneStrings array the improvements 
>>>>> will be even bigger.
>>>>>
>>>>> I have written a test case[2] which shows when a timezone which is 
>>>>> in lower end of the zone strings array, the performance 
>>>>> improvement this patch can get.
>>>>>
>>>>> test results:
>>>>>
>>>>> no patch:
>>>>> The total time is 742 ms!
>>>>>
>>>>> with this patch[1] :
>>>>> The total time is 508 ms!
>>>>>
>>>>> If increase the loop times to 1000000, the results are:
>>>>>
>>>>> no patch:
>>>>> The total time is 4743 ms!
>>>>>
>>>>> with this patch[1] :
>>>>> The total time is 2126 ms!
>>>>>
>>>>> The java version is:
>>>>>
>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version
>>>>> openjdk version "1.8.0-internal"
>>>>> OpenJDK Runtime Environment (build 
>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00)
>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode)
>>>>>
>>>>> [1] 
>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ 
>>>>> <http://cr.openjdk.java.net/%7Eyoudwei/DateFormatSymbol_perf/webrev.00/> 
>>>>>
>>>>> [2] GetZoneIndexTest.java
>>>
>>
>>
>> -- 
>> Best Regards,
>>
>> Deven
>


-- 
Best Regards,

Deven

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120420/571be039/attachment.html 


More information about the i18n-dev mailing list