<i18n dev> <i18n> Performance patch for DateFormatSymbols.getZoneIndex(String)

Deven You youdwei at linux.vnet.ibm.com
Mon Apr 23 01:51:30 PDT 2012


Hi Masayoshi,

Thanks for your reminder. I have updated the webrev[1]. Please review it.

[1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.03 
<http://cr.openjdk.java.net/%7Elittlee/OJDK-548/webrev.03>

Thanks a lot!
On 04/23/2012 01:19 PM, Masayoshi Okutsu wrote:
> Hi Deven,
>
> As Yoshito noted, lastZoneIndex needs to be transient because 
> DateFormatSymbols is serializable. Otherwise, the change looks good to 
> me.
>
> Thanks,
> Masayoshi
>
> On 4/20/2012 11:56 AM, Deven You wrote:
>> Yoshito,
>>
>> It is good to me for your suggestion. So Masayoshi, what is your 
>> opinion,  is cache the last zone index better than cache recent zone 
>> ID and zone indexes, do you have any concern about using last zone 
>> index?
>>
>> [1] The updated version of this patch 
>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ 
>> <http://cr.openjdk.java.net/%7Eyoudwei/DateFormatSymbol_perf/webrev.02/>
>>
>> Thanks a lot!
>>
>> On 04/11/2012 10:35 PM, Yoshito Umaoka wrote:
>>> Deven,
>>>
>>> If last time zone ID is cached along with the previously resolved 
>>> index, and getZoneIndex does not check the contents of array, it 
>>> might not work when zone strings are updated (setZoneStrings). So, 
>>> to make this change work properly, you have to reset 
>>> lastZoneID/zoneIndex in setZoneStrings(String[][]).
>>>
>>> In my opinion, we just need to keep last zone index. And it should 
>>> be 'transient'.
>>>
>>> For every getZoneIndex(String) call, input ID is compared to 
>>> zoneStrings[lastZoneIndex][0] - and if they matches, simply return 
>>> lastIndex. Otherwise, do the linear search and update lastZoneIndex- 
>>> like below.
>>>
>>> private transient int lastZoneIndex = 0; // with initial value to be 
>>> 0, you do not need to check lastZoneIndex >= 0 every time
>>>
>>> final int getZoneIndex(String ID) {
>>>   // fast path
>>>
>>>   // note: if you want to get rid of lastZoneIndex < 
>>> zoneStrings.length,
>>>   // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or 
>>> any valid index.
>>>
>>>   if (lastZoneIndex < zoneStrings.length && 
>>> ID.equals(zoneStrings[lastZoneIndex ][0]) {
>>>     return lastIndex;
>>>   }
>>>
>>>   // slow path
>>>   for (int index = 0; index < zoneStrings.length; index++) {
>>>     if (ID.equals(zoneStrings[index][0]) {
>>>       lastZoneIndex = index;
>>>       return index;
>>>     }
>>>   }
>>> }
>>>
>>> -Yoshito
>>>
>>> On 4/11/2012 3:02 AM, Deven You wrote:
>>>> I think Yoshito's suggestion make sense, since the getZoneIndex is 
>>>> an internal method, if there is no manual setting to change the 
>>>> timezone, The timezone ID won't be changed for one instance of 
>>>> SimpleDateFormt.
>>>>
>>>> I updated the patch webrev[1] for this suggestion and test the 
>>>> GetZoneIndexTest.java, the performance is almost the same with 
>>>> webrev.00.
>>>>
>>>> Please review the updated patch[1].
>>>>
>>>> Thanks a lot!
>>>>
>>>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 
>>>> <http://cr.openjdk.java.net/%7Elittlee/OJDK-548/webrev.01>
>>>>
>>>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote:
>>>>> I'm wondering if caching index matched last time (per 
>>>>> DateFormatSymbols instance) is sufficient, instead of keeping 
>>>>> multiple results in a hash map.
>>>>> I guess majority of use cases repeatedly refer the index of the 
>>>>> zone used by SimpleDateFormat.
>>>>>
>>>>> -Yoshito
>>>>>
>>>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote:
>>>>>> Hi Deven,
>>>>>>
>>>>>> Here are my comments on the proposed changes.
>>>>>>
>>>>>> - zoneIndexCache should be an instance field because WeakHashMap 
>>>>>> isn't thread-safe and the order of IDs in zoneStrings differs in 
>>>>>> each DateFormatSymbols.
>>>>>>
>>>>>> - equals shouldn't be replaced with equalsIgnoreCase because time 
>>>>>> zone IDs are (supposed to be) case-sensitive.
>>>>>>
>>>>>> - Utilize auto-boxing/unboxing.
>>>>>>
>>>>>> Thanks,
>>>>>> Masayoshi
>>>>>>
>>>>>> On 4/9/2012 12:10 PM, Deven You wrote:
>>>>>>> Hi i18n-devs,
>>>>>>>
>>>>>>> The getZoneIndex() method is expensive and it's performance 
>>>>>>> depends on which timezone you're in. This is because the code 
>>>>>>> steps through a list of timezones until it finds the current 
>>>>>>> one. This is happening over and over again. An idea would be to 
>>>>>>> either cache it or rewrite the way we store time zone ids, such 
>>>>>>> as a hashmap instead of an array.
>>>>>>> This patch[1] is for the cache option.
>>>>>>>
>>>>>>> Applications which  format/parse dates using SimpleDateFormat 
>>>>>>> repeatedly may obtain benefits from this patch, especially when 
>>>>>>> run in a timezone far down the zoneStrings array the 
>>>>>>> improvements will be even bigger.
>>>>>>>
>>>>>>> I have written a test case[2] which shows when a timezone which 
>>>>>>> is in lower end of the zone strings array, the performance 
>>>>>>> improvement this patch can get.
>>>>>>>
>>>>>>> test results:
>>>>>>>
>>>>>>> no patch:
>>>>>>> The total time is 742 ms!
>>>>>>>
>>>>>>> with this patch[1] :
>>>>>>> The total time is 508 ms!
>>>>>>>
>>>>>>> If increase the loop times to 1000000, the results are:
>>>>>>>
>>>>>>> no patch:
>>>>>>> The total time is 4743 ms!
>>>>>>>
>>>>>>> with this patch[1] :
>>>>>>> The total time is 2126 ms!
>>>>>>>
>>>>>>> The java version is:
>>>>>>>
>>>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java 
>>>>>>> -version
>>>>>>> openjdk version "1.8.0-internal"
>>>>>>> OpenJDK Runtime Environment (build 
>>>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00)
>>>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode)
>>>>>>>
>>>>>>> [1] 
>>>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ 
>>>>>>> <http://cr.openjdk.java.net/%7Eyoudwei/DateFormatSymbol_perf/webrev.00/> 
>>>>>>>
>>>>>>> [2] GetZoneIndexTest.java
>>>>>
>>>>
>>>>
>>>> -- 
>>>> Best Regards,
>>>>
>>>> Deven
>>>
>>
>>
>> -- 
>> Best Regards,
>>
>> Deven
>


-- 
Best Regards,

Deven

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120423/2bf57f9a/attachment.html 


More information about the i18n-dev mailing list