From youdwei at linux.vnet.ibm.com Sun Apr 8 20:10:43 2012 From: youdwei at linux.vnet.ibm.com (Deven You) Date: Mon, 09 Apr 2012 11:10:43 +0800 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) Message-ID: <4F825333.3020306@linux.vnet.ibm.com> Hi i18n-devs, The getZoneIndex() method is expensive and it's performance depends on which timezone you're in. This is because the code steps through a list of timezones until it finds the current one. This is happening over and over again. An idea would be to either cache it or rewrite the way we store time zone ids, such as a hashmap instead of an array. This patch[1] is for the cache option. Applications which format/parse dates using SimpleDateFormat repeatedly may obtain benefits from this patch, especially when run in a timezone far down the zoneStrings array the improvements will be even bigger. I have written a test case[2] which shows when a timezone which is in lower end of the zone strings array, the performance improvement this patch can get. test results: no patch: The total time is 742 ms! with this patch[1] : The total time is 508 ms! If increase the loop times to 1000000, the results are: no patch: The total time is 4743 ms! with this patch[1] : The total time is 2126 ms! The java version is: /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version openjdk version "1.8.0-internal" OpenJDK Runtime Environment (build 1.8.0-internal-deven_2012_03_14_16_14-b00) OpenJDK Server VM (build 24.0-b02, mixed mode) [1] http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ [2] GetZoneIndexTest.java -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120409/ed59e6c8/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: GetZoneIndexTest.java Type: text/x-java Size: 2450 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120409/ed59e6c8/GetZoneIndexTest.java From masayoshi.okutsu at oracle.com Mon Apr 9 07:46:41 2012 From: masayoshi.okutsu at oracle.com (Masayoshi Okutsu) Date: Mon, 09 Apr 2012 23:46:41 +0900 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F825333.3020306@linux.vnet.ibm.com> References: <4F825333.3020306@linux.vnet.ibm.com> Message-ID: <4F82F651.5090008@oracle.com> Hi Deven, Here are my comments on the proposed changes. - zoneIndexCache should be an instance field because WeakHashMap isn't thread-safe and the order of IDs in zoneStrings differs in each DateFormatSymbols. - equals shouldn't be replaced with equalsIgnoreCase because time zone IDs are (supposed to be) case-sensitive. - Utilize auto-boxing/unboxing. Thanks, Masayoshi On 4/9/2012 12:10 PM, Deven You wrote: > Hi i18n-devs, > > The getZoneIndex() method is expensive and it's performance depends on > which timezone you're in. This is because the code steps through a > list of timezones until it finds the current one. This is happening > over and over again. An idea would be to either cache it or rewrite > the way we store time zone ids, such as a hashmap instead of an array. > This patch[1] is for the cache option. > > Applications which format/parse dates using SimpleDateFormat > repeatedly may obtain benefits from this patch, especially when run in > a timezone far down the zoneStrings array the improvements will be > even bigger. > > I have written a test case[2] which shows when a timezone which is in > lower end of the zone strings array, the performance improvement this > patch can get. > > test results: > > no patch: > The total time is 742 ms! > > with this patch[1] : > The total time is 508 ms! > > If increase the loop times to 1000000, the results are: > > no patch: > The total time is 4743 ms! > > with this patch[1] : > The total time is 2126 ms! > > The java version is: > > /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version > openjdk version "1.8.0-internal" > OpenJDK Runtime Environment (build > 1.8.0-internal-deven_2012_03_14_16_14-b00) > OpenJDK Server VM (build 24.0-b02, mixed mode) > > [1] > http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ > > [2] GetZoneIndexTest.java From y.umaoka at gmail.com Tue Apr 10 13:00:39 2012 From: y.umaoka at gmail.com (Yoshito Umaoka) Date: Tue, 10 Apr 2012 16:00:39 -0400 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F82F651.5090008@oracle.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> Message-ID: <4F849167.1090707@gmail.com> I'm wondering if caching index matched last time (per DateFormatSymbols instance) is sufficient, instead of keeping multiple results in a hash map. I guess majority of use cases repeatedly refer the index of the zone used by SimpleDateFormat. -Yoshito On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: > Hi Deven, > > Here are my comments on the proposed changes. > > - zoneIndexCache should be an instance field because WeakHashMap isn't > thread-safe and the order of IDs in zoneStrings differs in each > DateFormatSymbols. > > - equals shouldn't be replaced with equalsIgnoreCase because time zone > IDs are (supposed to be) case-sensitive. > > - Utilize auto-boxing/unboxing. > > Thanks, > Masayoshi > > On 4/9/2012 12:10 PM, Deven You wrote: >> Hi i18n-devs, >> >> The getZoneIndex() method is expensive and it's performance depends >> on which timezone you're in. This is because the code steps through a >> list of timezones until it finds the current one. This is happening >> over and over again. An idea would be to either cache it or rewrite >> the way we store time zone ids, such as a hashmap instead of an array. >> This patch[1] is for the cache option. >> >> Applications which format/parse dates using SimpleDateFormat >> repeatedly may obtain benefits from this patch, especially when run >> in a timezone far down the zoneStrings array the improvements will be >> even bigger. >> >> I have written a test case[2] which shows when a timezone which is in >> lower end of the zone strings array, the performance improvement this >> patch can get. >> >> test results: >> >> no patch: >> The total time is 742 ms! >> >> with this patch[1] : >> The total time is 508 ms! >> >> If increase the loop times to 1000000, the results are: >> >> no patch: >> The total time is 4743 ms! >> >> with this patch[1] : >> The total time is 2126 ms! >> >> The java version is: >> >> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version >> openjdk version "1.8.0-internal" >> OpenJDK Runtime Environment (build >> 1.8.0-internal-deven_2012_03_14_16_14-b00) >> OpenJDK Server VM (build 24.0-b02, mixed mode) >> >> [1] >> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >> >> [2] GetZoneIndexTest.java From youdwei at linux.vnet.ibm.com Wed Apr 11 00:02:47 2012 From: youdwei at linux.vnet.ibm.com (Deven You) Date: Wed, 11 Apr 2012 15:02:47 +0800 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F849167.1090707@gmail.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> Message-ID: <4F852C97.8090603@linux.vnet.ibm.com> I think Yoshito's suggestion make sense, since the getZoneIndex is an internal method, if there is no manual setting to change the timezone, The timezone ID won't be changed for one instance of SimpleDateFormt. I updated the patch webrev[1] for this suggestion and test the GetZoneIndexTest.java, the performance is almost the same with webrev.00. Please review the updated patch[1]. Thanks a lot! [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: > I'm wondering if caching index matched last time (per > DateFormatSymbols instance) is sufficient, instead of keeping multiple > results in a hash map. > I guess majority of use cases repeatedly refer the index of the zone > used by SimpleDateFormat. > > -Yoshito > > On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >> Hi Deven, >> >> Here are my comments on the proposed changes. >> >> - zoneIndexCache should be an instance field because WeakHashMap >> isn't thread-safe and the order of IDs in zoneStrings differs in each >> DateFormatSymbols. >> >> - equals shouldn't be replaced with equalsIgnoreCase because time >> zone IDs are (supposed to be) case-sensitive. >> >> - Utilize auto-boxing/unboxing. >> >> Thanks, >> Masayoshi >> >> On 4/9/2012 12:10 PM, Deven You wrote: >>> Hi i18n-devs, >>> >>> The getZoneIndex() method is expensive and it's performance depends >>> on which timezone you're in. This is because the code steps through >>> a list of timezones until it finds the current one. This is >>> happening over and over again. An idea would be to either cache it >>> or rewrite the way we store time zone ids, such as a hashmap instead >>> of an array. >>> This patch[1] is for the cache option. >>> >>> Applications which format/parse dates using SimpleDateFormat >>> repeatedly may obtain benefits from this patch, especially when run >>> in a timezone far down the zoneStrings array the improvements will >>> be even bigger. >>> >>> I have written a test case[2] which shows when a timezone which is >>> in lower end of the zone strings array, the performance improvement >>> this patch can get. >>> >>> test results: >>> >>> no patch: >>> The total time is 742 ms! >>> >>> with this patch[1] : >>> The total time is 508 ms! >>> >>> If increase the loop times to 1000000, the results are: >>> >>> no patch: >>> The total time is 4743 ms! >>> >>> with this patch[1] : >>> The total time is 2126 ms! >>> >>> The java version is: >>> >>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version >>> openjdk version "1.8.0-internal" >>> OpenJDK Runtime Environment (build >>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>> >>> [1] >>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>> >>> >>> [2] GetZoneIndexTest.java > -- Best Regards, Deven -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120411/7698ab51/attachment.html From y.umaoka at gmail.com Wed Apr 11 07:35:10 2012 From: y.umaoka at gmail.com (Yoshito Umaoka) Date: Wed, 11 Apr 2012 10:35:10 -0400 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F852C97.8090603@linux.vnet.ibm.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> Message-ID: <4F85969E.5040200@gmail.com> Deven, If last time zone ID is cached along with the previously resolved index, and getZoneIndex does not check the contents of array, it might not work when zone strings are updated (setZoneStrings). So, to make this change work properly, you have to reset lastZoneID/zoneIndex in setZoneStrings(String[][]). In my opinion, we just need to keep last zone index. And it should be 'transient'. For every getZoneIndex(String) call, input ID is compared to zoneStrings[lastZoneIndex][0] - and if they matches, simply return lastIndex. Otherwise, do the linear search and update lastZoneIndex- like below. private transient int lastZoneIndex = 0; // with initial value to be 0, you do not need to check lastZoneIndex >= 0 every time final int getZoneIndex(String ID) { // fast path // note: if you want to get rid of lastZoneIndex < zoneStrings.length, // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or any valid index. if (lastZoneIndex < zoneStrings.length && ID.equals(zoneStrings[lastZoneIndex ][0]) { return lastIndex; } // slow path for (int index = 0; index < zoneStrings.length; index++) { if (ID.equals(zoneStrings[index][0]) { lastZoneIndex = index; return index; } } } -Yoshito On 4/11/2012 3:02 AM, Deven You wrote: > I think Yoshito's suggestion make sense, since the getZoneIndex is an > internal method, if there is no manual setting to change the timezone, > The timezone ID won't be changed for one instance of SimpleDateFormt. > > I updated the patch webrev[1] for this suggestion and test the > GetZoneIndexTest.java, the performance is almost the same with webrev.00. > > Please review the updated patch[1]. > > Thanks a lot! > > [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 > > > On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >> I'm wondering if caching index matched last time (per >> DateFormatSymbols instance) is sufficient, instead of keeping >> multiple results in a hash map. >> I guess majority of use cases repeatedly refer the index of the zone >> used by SimpleDateFormat. >> >> -Yoshito >> >> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>> Hi Deven, >>> >>> Here are my comments on the proposed changes. >>> >>> - zoneIndexCache should be an instance field because WeakHashMap >>> isn't thread-safe and the order of IDs in zoneStrings differs in >>> each DateFormatSymbols. >>> >>> - equals shouldn't be replaced with equalsIgnoreCase because time >>> zone IDs are (supposed to be) case-sensitive. >>> >>> - Utilize auto-boxing/unboxing. >>> >>> Thanks, >>> Masayoshi >>> >>> On 4/9/2012 12:10 PM, Deven You wrote: >>>> Hi i18n-devs, >>>> >>>> The getZoneIndex() method is expensive and it's performance depends >>>> on which timezone you're in. This is because the code steps through >>>> a list of timezones until it finds the current one. This is >>>> happening over and over again. An idea would be to either cache it >>>> or rewrite the way we store time zone ids, such as a hashmap >>>> instead of an array. >>>> This patch[1] is for the cache option. >>>> >>>> Applications which format/parse dates using SimpleDateFormat >>>> repeatedly may obtain benefits from this patch, especially when run >>>> in a timezone far down the zoneStrings array the improvements will >>>> be even bigger. >>>> >>>> I have written a test case[2] which shows when a timezone which is >>>> in lower end of the zone strings array, the performance improvement >>>> this patch can get. >>>> >>>> test results: >>>> >>>> no patch: >>>> The total time is 742 ms! >>>> >>>> with this patch[1] : >>>> The total time is 508 ms! >>>> >>>> If increase the loop times to 1000000, the results are: >>>> >>>> no patch: >>>> The total time is 4743 ms! >>>> >>>> with this patch[1] : >>>> The total time is 2126 ms! >>>> >>>> The java version is: >>>> >>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version >>>> openjdk version "1.8.0-internal" >>>> OpenJDK Runtime Environment (build >>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>> >>>> [1] >>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>> >>>> [2] GetZoneIndexTest.java >> > > > -- > Best Regards, > > Deven -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120411/24a6f7ef/attachment.html From youdwei at linux.vnet.ibm.com Thu Apr 19 19:56:32 2012 From: youdwei at linux.vnet.ibm.com (Deven You) Date: Fri, 20 Apr 2012 10:56:32 +0800 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F85969E.5040200@gmail.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> Message-ID: <4F90D060.5080800@linux.vnet.ibm.com> Yoshito, It is good to me for your suggestion. So Masayoshi, what is your opinion, is cache the last zone index better than cache recent zone ID and zone indexes, do you have any concern about using last zone index? [1] The updated version of this patch http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ Thanks a lot! On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: > Deven, > > If last time zone ID is cached along with the previously resolved > index, and getZoneIndex does not check the contents of array, it might > not work when zone strings are updated (setZoneStrings). So, to make > this change work properly, you have to reset lastZoneID/zoneIndex in > setZoneStrings(String[][]). > > In my opinion, we just need to keep last zone index. And it should be > 'transient'. > > For every getZoneIndex(String) call, input ID is compared to > zoneStrings[lastZoneIndex][0] - and if they matches, simply return > lastIndex. Otherwise, do the linear search and update lastZoneIndex- > like below. > > private transient int lastZoneIndex = 0; // with initial value to be > 0, you do not need to check lastZoneIndex >= 0 every time > > final int getZoneIndex(String ID) { > // fast path > > // note: if you want to get rid of lastZoneIndex < zoneStrings.length, > // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or any > valid index. > > if (lastZoneIndex < zoneStrings.length && > ID.equals(zoneStrings[lastZoneIndex ][0]) { > return lastIndex; > } > > // slow path > for (int index = 0; index < zoneStrings.length; index++) { > if (ID.equals(zoneStrings[index][0]) { > lastZoneIndex = index; > return index; > } > } > } > > -Yoshito > > On 4/11/2012 3:02 AM, Deven You wrote: >> I think Yoshito's suggestion make sense, since the getZoneIndex is an >> internal method, if there is no manual setting to change the >> timezone, The timezone ID won't be changed for one instance of >> SimpleDateFormt. >> >> I updated the patch webrev[1] for this suggestion and test the >> GetZoneIndexTest.java, the performance is almost the same with webrev.00. >> >> Please review the updated patch[1]. >> >> Thanks a lot! >> >> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >> >> >> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>> I'm wondering if caching index matched last time (per >>> DateFormatSymbols instance) is sufficient, instead of keeping >>> multiple results in a hash map. >>> I guess majority of use cases repeatedly refer the index of the zone >>> used by SimpleDateFormat. >>> >>> -Yoshito >>> >>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>> Hi Deven, >>>> >>>> Here are my comments on the proposed changes. >>>> >>>> - zoneIndexCache should be an instance field because WeakHashMap >>>> isn't thread-safe and the order of IDs in zoneStrings differs in >>>> each DateFormatSymbols. >>>> >>>> - equals shouldn't be replaced with equalsIgnoreCase because time >>>> zone IDs are (supposed to be) case-sensitive. >>>> >>>> - Utilize auto-boxing/unboxing. >>>> >>>> Thanks, >>>> Masayoshi >>>> >>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>> Hi i18n-devs, >>>>> >>>>> The getZoneIndex() method is expensive and it's performance >>>>> depends on which timezone you're in. This is because the code >>>>> steps through a list of timezones until it finds the current one. >>>>> This is happening over and over again. An idea would be to either >>>>> cache it or rewrite the way we store time zone ids, such as a >>>>> hashmap instead of an array. >>>>> This patch[1] is for the cache option. >>>>> >>>>> Applications which format/parse dates using SimpleDateFormat >>>>> repeatedly may obtain benefits from this patch, especially when >>>>> run in a timezone far down the zoneStrings array the improvements >>>>> will be even bigger. >>>>> >>>>> I have written a test case[2] which shows when a timezone which is >>>>> in lower end of the zone strings array, the performance >>>>> improvement this patch can get. >>>>> >>>>> test results: >>>>> >>>>> no patch: >>>>> The total time is 742 ms! >>>>> >>>>> with this patch[1] : >>>>> The total time is 508 ms! >>>>> >>>>> If increase the loop times to 1000000, the results are: >>>>> >>>>> no patch: >>>>> The total time is 4743 ms! >>>>> >>>>> with this patch[1] : >>>>> The total time is 2126 ms! >>>>> >>>>> The java version is: >>>>> >>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java -version >>>>> openjdk version "1.8.0-internal" >>>>> OpenJDK Runtime Environment (build >>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>> >>>>> [1] >>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>> >>>>> >>>>> [2] GetZoneIndexTest.java >>> >> >> >> -- >> Best Regards, >> >> Deven > -- Best Regards, Deven -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120420/571be039/attachment.html From masayoshi.okutsu at oracle.com Sun Apr 22 22:19:55 2012 From: masayoshi.okutsu at oracle.com (Masayoshi Okutsu) Date: Mon, 23 Apr 2012 14:19:55 +0900 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F90D060.5080800@linux.vnet.ibm.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> <4F90D060.5080800@linux.vnet.ibm.com> Message-ID: <4F94E67B.3090800@oracle.com> Hi Deven, As Yoshito noted, lastZoneIndex needs to be transient because DateFormatSymbols is serializable. Otherwise, the change looks good to me. Thanks, Masayoshi On 4/20/2012 11:56 AM, Deven You wrote: > Yoshito, > > It is good to me for your suggestion. So Masayoshi, what is your > opinion, is cache the last zone index better than cache recent zone > ID and zone indexes, do you have any concern about using last zone index? > > [1] The updated version of this patch > http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ > > > Thanks a lot! > > On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: >> Deven, >> >> If last time zone ID is cached along with the previously resolved >> index, and getZoneIndex does not check the contents of array, it >> might not work when zone strings are updated (setZoneStrings). So, to >> make this change work properly, you have to reset >> lastZoneID/zoneIndex in setZoneStrings(String[][]). >> >> In my opinion, we just need to keep last zone index. And it should be >> 'transient'. >> >> For every getZoneIndex(String) call, input ID is compared to >> zoneStrings[lastZoneIndex][0] - and if they matches, simply return >> lastIndex. Otherwise, do the linear search and update lastZoneIndex- >> like below. >> >> private transient int lastZoneIndex = 0; // with initial value to be >> 0, you do not need to check lastZoneIndex >= 0 every time >> >> final int getZoneIndex(String ID) { >> // fast path >> >> // note: if you want to get rid of lastZoneIndex < zoneStrings.length, >> // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or >> any valid index. >> >> if (lastZoneIndex < zoneStrings.length && >> ID.equals(zoneStrings[lastZoneIndex ][0]) { >> return lastIndex; >> } >> >> // slow path >> for (int index = 0; index < zoneStrings.length; index++) { >> if (ID.equals(zoneStrings[index][0]) { >> lastZoneIndex = index; >> return index; >> } >> } >> } >> >> -Yoshito >> >> On 4/11/2012 3:02 AM, Deven You wrote: >>> I think Yoshito's suggestion make sense, since the getZoneIndex is >>> an internal method, if there is no manual setting to change the >>> timezone, The timezone ID won't be changed for one instance of >>> SimpleDateFormt. >>> >>> I updated the patch webrev[1] for this suggestion and test the >>> GetZoneIndexTest.java, the performance is almost the same with >>> webrev.00. >>> >>> Please review the updated patch[1]. >>> >>> Thanks a lot! >>> >>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >>> >>> >>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>>> I'm wondering if caching index matched last time (per >>>> DateFormatSymbols instance) is sufficient, instead of keeping >>>> multiple results in a hash map. >>>> I guess majority of use cases repeatedly refer the index of the >>>> zone used by SimpleDateFormat. >>>> >>>> -Yoshito >>>> >>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>>> Hi Deven, >>>>> >>>>> Here are my comments on the proposed changes. >>>>> >>>>> - zoneIndexCache should be an instance field because WeakHashMap >>>>> isn't thread-safe and the order of IDs in zoneStrings differs in >>>>> each DateFormatSymbols. >>>>> >>>>> - equals shouldn't be replaced with equalsIgnoreCase because time >>>>> zone IDs are (supposed to be) case-sensitive. >>>>> >>>>> - Utilize auto-boxing/unboxing. >>>>> >>>>> Thanks, >>>>> Masayoshi >>>>> >>>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>>> Hi i18n-devs, >>>>>> >>>>>> The getZoneIndex() method is expensive and it's performance >>>>>> depends on which timezone you're in. This is because the code >>>>>> steps through a list of timezones until it finds the current one. >>>>>> This is happening over and over again. An idea would be to either >>>>>> cache it or rewrite the way we store time zone ids, such as a >>>>>> hashmap instead of an array. >>>>>> This patch[1] is for the cache option. >>>>>> >>>>>> Applications which format/parse dates using SimpleDateFormat >>>>>> repeatedly may obtain benefits from this patch, especially when >>>>>> run in a timezone far down the zoneStrings array the improvements >>>>>> will be even bigger. >>>>>> >>>>>> I have written a test case[2] which shows when a timezone which >>>>>> is in lower end of the zone strings array, the performance >>>>>> improvement this patch can get. >>>>>> >>>>>> test results: >>>>>> >>>>>> no patch: >>>>>> The total time is 742 ms! >>>>>> >>>>>> with this patch[1] : >>>>>> The total time is 508 ms! >>>>>> >>>>>> If increase the loop times to 1000000, the results are: >>>>>> >>>>>> no patch: >>>>>> The total time is 4743 ms! >>>>>> >>>>>> with this patch[1] : >>>>>> The total time is 2126 ms! >>>>>> >>>>>> The java version is: >>>>>> >>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java >>>>>> -version >>>>>> openjdk version "1.8.0-internal" >>>>>> OpenJDK Runtime Environment (build >>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>>> >>>>>> [1] >>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>>> >>>>>> >>>>>> [2] GetZoneIndexTest.java >>>> >>> >>> >>> -- >>> Best Regards, >>> >>> Deven >> > > > -- > Best Regards, > > Deven From youdwei at linux.vnet.ibm.com Mon Apr 23 01:51:30 2012 From: youdwei at linux.vnet.ibm.com (Deven You) Date: Mon, 23 Apr 2012 16:51:30 +0800 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F94E67B.3090800@oracle.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> <4F90D060.5080800@linux.vnet.ibm.com> <4F94E67B.3090800@oracle.com> Message-ID: <4F951812.70906@linux.vnet.ibm.com> Hi Masayoshi, Thanks for your reminder. I have updated the webrev[1]. Please review it. [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.03 Thanks a lot! On 04/23/2012 01:19 PM, Masayoshi Okutsu wrote: > Hi Deven, > > As Yoshito noted, lastZoneIndex needs to be transient because > DateFormatSymbols is serializable. Otherwise, the change looks good to > me. > > Thanks, > Masayoshi > > On 4/20/2012 11:56 AM, Deven You wrote: >> Yoshito, >> >> It is good to me for your suggestion. So Masayoshi, what is your >> opinion, is cache the last zone index better than cache recent zone >> ID and zone indexes, do you have any concern about using last zone >> index? >> >> [1] The updated version of this patch >> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ >> >> >> Thanks a lot! >> >> On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: >>> Deven, >>> >>> If last time zone ID is cached along with the previously resolved >>> index, and getZoneIndex does not check the contents of array, it >>> might not work when zone strings are updated (setZoneStrings). So, >>> to make this change work properly, you have to reset >>> lastZoneID/zoneIndex in setZoneStrings(String[][]). >>> >>> In my opinion, we just need to keep last zone index. And it should >>> be 'transient'. >>> >>> For every getZoneIndex(String) call, input ID is compared to >>> zoneStrings[lastZoneIndex][0] - and if they matches, simply return >>> lastIndex. Otherwise, do the linear search and update lastZoneIndex- >>> like below. >>> >>> private transient int lastZoneIndex = 0; // with initial value to be >>> 0, you do not need to check lastZoneIndex >= 0 every time >>> >>> final int getZoneIndex(String ID) { >>> // fast path >>> >>> // note: if you want to get rid of lastZoneIndex < >>> zoneStrings.length, >>> // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or >>> any valid index. >>> >>> if (lastZoneIndex < zoneStrings.length && >>> ID.equals(zoneStrings[lastZoneIndex ][0]) { >>> return lastIndex; >>> } >>> >>> // slow path >>> for (int index = 0; index < zoneStrings.length; index++) { >>> if (ID.equals(zoneStrings[index][0]) { >>> lastZoneIndex = index; >>> return index; >>> } >>> } >>> } >>> >>> -Yoshito >>> >>> On 4/11/2012 3:02 AM, Deven You wrote: >>>> I think Yoshito's suggestion make sense, since the getZoneIndex is >>>> an internal method, if there is no manual setting to change the >>>> timezone, The timezone ID won't be changed for one instance of >>>> SimpleDateFormt. >>>> >>>> I updated the patch webrev[1] for this suggestion and test the >>>> GetZoneIndexTest.java, the performance is almost the same with >>>> webrev.00. >>>> >>>> Please review the updated patch[1]. >>>> >>>> Thanks a lot! >>>> >>>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >>>> >>>> >>>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>>>> I'm wondering if caching index matched last time (per >>>>> DateFormatSymbols instance) is sufficient, instead of keeping >>>>> multiple results in a hash map. >>>>> I guess majority of use cases repeatedly refer the index of the >>>>> zone used by SimpleDateFormat. >>>>> >>>>> -Yoshito >>>>> >>>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>>>> Hi Deven, >>>>>> >>>>>> Here are my comments on the proposed changes. >>>>>> >>>>>> - zoneIndexCache should be an instance field because WeakHashMap >>>>>> isn't thread-safe and the order of IDs in zoneStrings differs in >>>>>> each DateFormatSymbols. >>>>>> >>>>>> - equals shouldn't be replaced with equalsIgnoreCase because time >>>>>> zone IDs are (supposed to be) case-sensitive. >>>>>> >>>>>> - Utilize auto-boxing/unboxing. >>>>>> >>>>>> Thanks, >>>>>> Masayoshi >>>>>> >>>>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>>>> Hi i18n-devs, >>>>>>> >>>>>>> The getZoneIndex() method is expensive and it's performance >>>>>>> depends on which timezone you're in. This is because the code >>>>>>> steps through a list of timezones until it finds the current >>>>>>> one. This is happening over and over again. An idea would be to >>>>>>> either cache it or rewrite the way we store time zone ids, such >>>>>>> as a hashmap instead of an array. >>>>>>> This patch[1] is for the cache option. >>>>>>> >>>>>>> Applications which format/parse dates using SimpleDateFormat >>>>>>> repeatedly may obtain benefits from this patch, especially when >>>>>>> run in a timezone far down the zoneStrings array the >>>>>>> improvements will be even bigger. >>>>>>> >>>>>>> I have written a test case[2] which shows when a timezone which >>>>>>> is in lower end of the zone strings array, the performance >>>>>>> improvement this patch can get. >>>>>>> >>>>>>> test results: >>>>>>> >>>>>>> no patch: >>>>>>> The total time is 742 ms! >>>>>>> >>>>>>> with this patch[1] : >>>>>>> The total time is 508 ms! >>>>>>> >>>>>>> If increase the loop times to 1000000, the results are: >>>>>>> >>>>>>> no patch: >>>>>>> The total time is 4743 ms! >>>>>>> >>>>>>> with this patch[1] : >>>>>>> The total time is 2126 ms! >>>>>>> >>>>>>> The java version is: >>>>>>> >>>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java >>>>>>> -version >>>>>>> openjdk version "1.8.0-internal" >>>>>>> OpenJDK Runtime Environment (build >>>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>>>> >>>>>>> [1] >>>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>>>> >>>>>>> >>>>>>> [2] GetZoneIndexTest.java >>>>> >>>> >>>> >>>> -- >>>> Best Regards, >>>> >>>> Deven >>> >> >> >> -- >> Best Regards, >> >> Deven > -- Best Regards, Deven -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120423/2bf57f9a/attachment.html From masayoshi.okutsu at oracle.com Mon Apr 23 01:54:19 2012 From: masayoshi.okutsu at oracle.com (Masayoshi Okutsu) Date: Mon, 23 Apr 2012 17:54:19 +0900 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F951812.70906@linux.vnet.ibm.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> <4F90D060.5080800@linux.vnet.ibm.com> <4F94E67B.3090800@oracle.com> <4F951812.70906@linux.vnet.ibm.com> Message-ID: <4F9518BB.2020607@oracle.com> Looks good! Masayoshi On 4/23/2012 5:51 PM, Deven You wrote: > Hi Masayoshi, > > Thanks for your reminder. I have updated the webrev[1]. Please review it. > > [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.03 > > > Thanks a lot! > On 04/23/2012 01:19 PM, Masayoshi Okutsu wrote: >> Hi Deven, >> >> As Yoshito noted, lastZoneIndex needs to be transient because >> DateFormatSymbols is serializable. Otherwise, the change looks good >> to me. >> >> Thanks, >> Masayoshi >> >> On 4/20/2012 11:56 AM, Deven You wrote: >>> Yoshito, >>> >>> It is good to me for your suggestion. So Masayoshi, what is your >>> opinion, is cache the last zone index better than cache recent zone >>> ID and zone indexes, do you have any concern about using last zone >>> index? >>> >>> [1] The updated version of this patch >>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ >>> >>> >>> >>> Thanks a lot! >>> >>> On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: >>>> Deven, >>>> >>>> If last time zone ID is cached along with the previously resolved >>>> index, and getZoneIndex does not check the contents of array, it >>>> might not work when zone strings are updated (setZoneStrings). So, >>>> to make this change work properly, you have to reset >>>> lastZoneID/zoneIndex in setZoneStrings(String[][]). >>>> >>>> In my opinion, we just need to keep last zone index. And it should >>>> be 'transient'. >>>> >>>> For every getZoneIndex(String) call, input ID is compared to >>>> zoneStrings[lastZoneIndex][0] - and if they matches, simply return >>>> lastIndex. Otherwise, do the linear search and update >>>> lastZoneIndex- like below. >>>> >>>> private transient int lastZoneIndex = 0; // with initial value to >>>> be 0, you do not need to check lastZoneIndex >= 0 every time >>>> >>>> final int getZoneIndex(String ID) { >>>> // fast path >>>> >>>> // note: if you want to get rid of lastZoneIndex < >>>> zoneStrings.length, >>>> // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or >>>> any valid index. >>>> >>>> if (lastZoneIndex < zoneStrings.length && >>>> ID.equals(zoneStrings[lastZoneIndex ][0]) { >>>> return lastIndex; >>>> } >>>> >>>> // slow path >>>> for (int index = 0; index < zoneStrings.length; index++) { >>>> if (ID.equals(zoneStrings[index][0]) { >>>> lastZoneIndex = index; >>>> return index; >>>> } >>>> } >>>> } >>>> >>>> -Yoshito >>>> >>>> On 4/11/2012 3:02 AM, Deven You wrote: >>>>> I think Yoshito's suggestion make sense, since the getZoneIndex is >>>>> an internal method, if there is no manual setting to change the >>>>> timezone, The timezone ID won't be changed for one instance of >>>>> SimpleDateFormt. >>>>> >>>>> I updated the patch webrev[1] for this suggestion and test the >>>>> GetZoneIndexTest.java, the performance is almost the same with >>>>> webrev.00. >>>>> >>>>> Please review the updated patch[1]. >>>>> >>>>> Thanks a lot! >>>>> >>>>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >>>>> >>>>> >>>>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>>>>> I'm wondering if caching index matched last time (per >>>>>> DateFormatSymbols instance) is sufficient, instead of keeping >>>>>> multiple results in a hash map. >>>>>> I guess majority of use cases repeatedly refer the index of the >>>>>> zone used by SimpleDateFormat. >>>>>> >>>>>> -Yoshito >>>>>> >>>>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>>>>> Hi Deven, >>>>>>> >>>>>>> Here are my comments on the proposed changes. >>>>>>> >>>>>>> - zoneIndexCache should be an instance field because WeakHashMap >>>>>>> isn't thread-safe and the order of IDs in zoneStrings differs in >>>>>>> each DateFormatSymbols. >>>>>>> >>>>>>> - equals shouldn't be replaced with equalsIgnoreCase because >>>>>>> time zone IDs are (supposed to be) case-sensitive. >>>>>>> >>>>>>> - Utilize auto-boxing/unboxing. >>>>>>> >>>>>>> Thanks, >>>>>>> Masayoshi >>>>>>> >>>>>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>>>>> Hi i18n-devs, >>>>>>>> >>>>>>>> The getZoneIndex() method is expensive and it's performance >>>>>>>> depends on which timezone you're in. This is because the code >>>>>>>> steps through a list of timezones until it finds the current >>>>>>>> one. This is happening over and over again. An idea would be to >>>>>>>> either cache it or rewrite the way we store time zone ids, such >>>>>>>> as a hashmap instead of an array. >>>>>>>> This patch[1] is for the cache option. >>>>>>>> >>>>>>>> Applications which format/parse dates using SimpleDateFormat >>>>>>>> repeatedly may obtain benefits from this patch, especially when >>>>>>>> run in a timezone far down the zoneStrings array the >>>>>>>> improvements will be even bigger. >>>>>>>> >>>>>>>> I have written a test case[2] which shows when a timezone which >>>>>>>> is in lower end of the zone strings array, the performance >>>>>>>> improvement this patch can get. >>>>>>>> >>>>>>>> test results: >>>>>>>> >>>>>>>> no patch: >>>>>>>> The total time is 742 ms! >>>>>>>> >>>>>>>> with this patch[1] : >>>>>>>> The total time is 508 ms! >>>>>>>> >>>>>>>> If increase the loop times to 1000000, the results are: >>>>>>>> >>>>>>>> no patch: >>>>>>>> The total time is 4743 ms! >>>>>>>> >>>>>>>> with this patch[1] : >>>>>>>> The total time is 2126 ms! >>>>>>>> >>>>>>>> The java version is: >>>>>>>> >>>>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java >>>>>>>> -version >>>>>>>> openjdk version "1.8.0-internal" >>>>>>>> OpenJDK Runtime Environment (build >>>>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>>>>> >>>>>>>> [1] >>>>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> [2] GetZoneIndexTest.java >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> >>>>> Deven >>>> >>> >>> >>> -- >>> Best Regards, >>> >>> Deven >> > > > -- > Best Regards, > > Deven From littlee at linux.vnet.ibm.com Mon Apr 23 02:01:36 2012 From: littlee at linux.vnet.ibm.com (Charles Lee) Date: Mon, 23 Apr 2012 17:01:36 +0800 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F9518BB.2020607@oracle.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> <4F90D060.5080800@linux.vnet.ibm.com> <4F94E67B.3090800@oracle.com> <4F951812.70906@linux.vnet.ibm.com> <4F9518BB.2020607@oracle.com> Message-ID: <4F951A70.1010506@linux.vnet.ibm.com> Hi Masayoshi, I'd like to sponsor this patch. But I did not figure out which repos the patch should go to. Is it jdk8-tl repository? On 04/23/2012 04:54 PM, Masayoshi Okutsu wrote: > Looks good! > > Masayoshi > > On 4/23/2012 5:51 PM, Deven You wrote: >> Hi Masayoshi, >> >> Thanks for your reminder. I have updated the webrev[1]. Please review >> it. >> >> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.03 >> >> >> Thanks a lot! >> On 04/23/2012 01:19 PM, Masayoshi Okutsu wrote: >>> Hi Deven, >>> >>> As Yoshito noted, lastZoneIndex needs to be transient because >>> DateFormatSymbols is serializable. Otherwise, the change looks good >>> to me. >>> >>> Thanks, >>> Masayoshi >>> >>> On 4/20/2012 11:56 AM, Deven You wrote: >>>> Yoshito, >>>> >>>> It is good to me for your suggestion. So Masayoshi, what is your >>>> opinion, is cache the last zone index better than cache recent >>>> zone ID and zone indexes, do you have any concern about using last >>>> zone index? >>>> >>>> [1] The updated version of this patch >>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ >>>> >>>> >>>> Thanks a lot! >>>> >>>> On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: >>>>> Deven, >>>>> >>>>> If last time zone ID is cached along with the previously resolved >>>>> index, and getZoneIndex does not check the contents of array, it >>>>> might not work when zone strings are updated (setZoneStrings). So, >>>>> to make this change work properly, you have to reset >>>>> lastZoneID/zoneIndex in setZoneStrings(String[][]). >>>>> >>>>> In my opinion, we just need to keep last zone index. And it should >>>>> be 'transient'. >>>>> >>>>> For every getZoneIndex(String) call, input ID is compared to >>>>> zoneStrings[lastZoneIndex][0] - and if they matches, simply return >>>>> lastIndex. Otherwise, do the linear search and update >>>>> lastZoneIndex- like below. >>>>> >>>>> private transient int lastZoneIndex = 0; // with initial value to >>>>> be 0, you do not need to check lastZoneIndex >= 0 every time >>>>> >>>>> final int getZoneIndex(String ID) { >>>>> // fast path >>>>> >>>>> // note: if you want to get rid of lastZoneIndex < >>>>> zoneStrings.length, >>>>> // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or >>>>> any valid index. >>>>> >>>>> if (lastZoneIndex < zoneStrings.length && >>>>> ID.equals(zoneStrings[lastZoneIndex ][0]) { >>>>> return lastIndex; >>>>> } >>>>> >>>>> // slow path >>>>> for (int index = 0; index < zoneStrings.length; index++) { >>>>> if (ID.equals(zoneStrings[index][0]) { >>>>> lastZoneIndex = index; >>>>> return index; >>>>> } >>>>> } >>>>> } >>>>> >>>>> -Yoshito >>>>> >>>>> On 4/11/2012 3:02 AM, Deven You wrote: >>>>>> I think Yoshito's suggestion make sense, since the getZoneIndex >>>>>> is an internal method, if there is no manual setting to change >>>>>> the timezone, The timezone ID won't be changed for one instance >>>>>> of SimpleDateFormt. >>>>>> >>>>>> I updated the patch webrev[1] for this suggestion and test the >>>>>> GetZoneIndexTest.java, the performance is almost the same with >>>>>> webrev.00. >>>>>> >>>>>> Please review the updated patch[1]. >>>>>> >>>>>> Thanks a lot! >>>>>> >>>>>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >>>>>> >>>>>> >>>>>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>>>>>> I'm wondering if caching index matched last time (per >>>>>>> DateFormatSymbols instance) is sufficient, instead of keeping >>>>>>> multiple results in a hash map. >>>>>>> I guess majority of use cases repeatedly refer the index of the >>>>>>> zone used by SimpleDateFormat. >>>>>>> >>>>>>> -Yoshito >>>>>>> >>>>>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>>>>>> Hi Deven, >>>>>>>> >>>>>>>> Here are my comments on the proposed changes. >>>>>>>> >>>>>>>> - zoneIndexCache should be an instance field because >>>>>>>> WeakHashMap isn't thread-safe and the order of IDs in >>>>>>>> zoneStrings differs in each DateFormatSymbols. >>>>>>>> >>>>>>>> - equals shouldn't be replaced with equalsIgnoreCase because >>>>>>>> time zone IDs are (supposed to be) case-sensitive. >>>>>>>> >>>>>>>> - Utilize auto-boxing/unboxing. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Masayoshi >>>>>>>> >>>>>>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>>>>>> Hi i18n-devs, >>>>>>>>> >>>>>>>>> The getZoneIndex() method is expensive and it's performance >>>>>>>>> depends on which timezone you're in. This is because the code >>>>>>>>> steps through a list of timezones until it finds the current >>>>>>>>> one. This is happening over and over again. An idea would be >>>>>>>>> to either cache it or rewrite the way we store time zone ids, >>>>>>>>> such as a hashmap instead of an array. >>>>>>>>> This patch[1] is for the cache option. >>>>>>>>> >>>>>>>>> Applications which format/parse dates using SimpleDateFormat >>>>>>>>> repeatedly may obtain benefits from this patch, especially >>>>>>>>> when run in a timezone far down the zoneStrings array the >>>>>>>>> improvements will be even bigger. >>>>>>>>> >>>>>>>>> I have written a test case[2] which shows when a timezone >>>>>>>>> which is in lower end of the zone strings array, the >>>>>>>>> performance improvement this patch can get. >>>>>>>>> >>>>>>>>> test results: >>>>>>>>> >>>>>>>>> no patch: >>>>>>>>> The total time is 742 ms! >>>>>>>>> >>>>>>>>> with this patch[1] : >>>>>>>>> The total time is 508 ms! >>>>>>>>> >>>>>>>>> If increase the loop times to 1000000, the results are: >>>>>>>>> >>>>>>>>> no patch: >>>>>>>>> The total time is 4743 ms! >>>>>>>>> >>>>>>>>> with this patch[1] : >>>>>>>>> The total time is 2126 ms! >>>>>>>>> >>>>>>>>> The java version is: >>>>>>>>> >>>>>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java >>>>>>>>> -version >>>>>>>>> openjdk version "1.8.0-internal" >>>>>>>>> OpenJDK Runtime Environment (build >>>>>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>>>>>> >>>>>>>>> [1] >>>>>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>>>>>> >>>>>>>>> >>>>>>>>> [2] GetZoneIndexTest.java >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Best Regards, >>>>>> >>>>>> Deven >>>>> >>>> >>>> >>>> -- >>>> Best Regards, >>>> >>>> Deven >>> >> >> >> -- >> Best Regards, >> >> Deven > -- Yours Charles From masayoshi.okutsu at oracle.com Tue Apr 24 01:54:05 2012 From: masayoshi.okutsu at oracle.com (Masayoshi Okutsu) Date: Tue, 24 Apr 2012 17:54:05 +0900 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F951A70.1010506@linux.vnet.ibm.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> <4F90D060.5080800@linux.vnet.ibm.com> <4F94E67B.3090800@oracle.com> <4F951812.70906@linux.vnet.ibm.com> <4F9518BB.2020607@oracle.com> <4F951A70.1010506@linux.vnet.ibm.com> Message-ID: <4F966A2D.8050807@oracle.com> Hi Charles, Yes, it's jdk8-tl. Thanks, Masayoshi On 4/23/2012 6:01 PM, Charles Lee wrote: > Hi Masayoshi, > > I'd like to sponsor this patch. But I did not figure out which repos > the patch should go to. Is it jdk8-tl repository? > > On 04/23/2012 04:54 PM, Masayoshi Okutsu wrote: >> Looks good! >> >> Masayoshi >> >> On 4/23/2012 5:51 PM, Deven You wrote: >>> Hi Masayoshi, >>> >>> Thanks for your reminder. I have updated the webrev[1]. Please >>> review it. >>> >>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.03 >>> >>> >>> Thanks a lot! >>> On 04/23/2012 01:19 PM, Masayoshi Okutsu wrote: >>>> Hi Deven, >>>> >>>> As Yoshito noted, lastZoneIndex needs to be transient because >>>> DateFormatSymbols is serializable. Otherwise, the change looks good >>>> to me. >>>> >>>> Thanks, >>>> Masayoshi >>>> >>>> On 4/20/2012 11:56 AM, Deven You wrote: >>>>> Yoshito, >>>>> >>>>> It is good to me for your suggestion. So Masayoshi, what is your >>>>> opinion, is cache the last zone index better than cache recent >>>>> zone ID and zone indexes, do you have any concern about using last >>>>> zone index? >>>>> >>>>> [1] The updated version of this patch >>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ >>>>> >>>>> >>>>> >>>>> Thanks a lot! >>>>> >>>>> On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: >>>>>> Deven, >>>>>> >>>>>> If last time zone ID is cached along with the previously resolved >>>>>> index, and getZoneIndex does not check the contents of array, it >>>>>> might not work when zone strings are updated (setZoneStrings). >>>>>> So, to make this change work properly, you have to reset >>>>>> lastZoneID/zoneIndex in setZoneStrings(String[][]). >>>>>> >>>>>> In my opinion, we just need to keep last zone index. And it >>>>>> should be 'transient'. >>>>>> >>>>>> For every getZoneIndex(String) call, input ID is compared to >>>>>> zoneStrings[lastZoneIndex][0] - and if they matches, simply >>>>>> return lastIndex. Otherwise, do the linear search and update >>>>>> lastZoneIndex- like below. >>>>>> >>>>>> private transient int lastZoneIndex = 0; // with initial value to >>>>>> be 0, you do not need to check lastZoneIndex >= 0 every time >>>>>> >>>>>> final int getZoneIndex(String ID) { >>>>>> // fast path >>>>>> >>>>>> // note: if you want to get rid of lastZoneIndex < >>>>>> zoneStrings.length, >>>>>> // setZoneStrings(String[][]) should reset lastZoneIndex to 0 >>>>>> or any valid index. >>>>>> >>>>>> if (lastZoneIndex < zoneStrings.length && >>>>>> ID.equals(zoneStrings[lastZoneIndex ][0]) { >>>>>> return lastIndex; >>>>>> } >>>>>> >>>>>> // slow path >>>>>> for (int index = 0; index < zoneStrings.length; index++) { >>>>>> if (ID.equals(zoneStrings[index][0]) { >>>>>> lastZoneIndex = index; >>>>>> return index; >>>>>> } >>>>>> } >>>>>> } >>>>>> >>>>>> -Yoshito >>>>>> >>>>>> On 4/11/2012 3:02 AM, Deven You wrote: >>>>>>> I think Yoshito's suggestion make sense, since the getZoneIndex >>>>>>> is an internal method, if there is no manual setting to change >>>>>>> the timezone, The timezone ID won't be changed for one instance >>>>>>> of SimpleDateFormt. >>>>>>> >>>>>>> I updated the patch webrev[1] for this suggestion and test the >>>>>>> GetZoneIndexTest.java, the performance is almost the same with >>>>>>> webrev.00. >>>>>>> >>>>>>> Please review the updated patch[1]. >>>>>>> >>>>>>> Thanks a lot! >>>>>>> >>>>>>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >>>>>>> >>>>>>> >>>>>>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>>>>>>> I'm wondering if caching index matched last time (per >>>>>>>> DateFormatSymbols instance) is sufficient, instead of keeping >>>>>>>> multiple results in a hash map. >>>>>>>> I guess majority of use cases repeatedly refer the index of the >>>>>>>> zone used by SimpleDateFormat. >>>>>>>> >>>>>>>> -Yoshito >>>>>>>> >>>>>>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>>>>>>> Hi Deven, >>>>>>>>> >>>>>>>>> Here are my comments on the proposed changes. >>>>>>>>> >>>>>>>>> - zoneIndexCache should be an instance field because >>>>>>>>> WeakHashMap isn't thread-safe and the order of IDs in >>>>>>>>> zoneStrings differs in each DateFormatSymbols. >>>>>>>>> >>>>>>>>> - equals shouldn't be replaced with equalsIgnoreCase because >>>>>>>>> time zone IDs are (supposed to be) case-sensitive. >>>>>>>>> >>>>>>>>> - Utilize auto-boxing/unboxing. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Masayoshi >>>>>>>>> >>>>>>>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>>>>>>> Hi i18n-devs, >>>>>>>>>> >>>>>>>>>> The getZoneIndex() method is expensive and it's performance >>>>>>>>>> depends on which timezone you're in. This is because the code >>>>>>>>>> steps through a list of timezones until it finds the current >>>>>>>>>> one. This is happening over and over again. An idea would be >>>>>>>>>> to either cache it or rewrite the way we store time zone ids, >>>>>>>>>> such as a hashmap instead of an array. >>>>>>>>>> This patch[1] is for the cache option. >>>>>>>>>> >>>>>>>>>> Applications which format/parse dates using SimpleDateFormat >>>>>>>>>> repeatedly may obtain benefits from this patch, especially >>>>>>>>>> when run in a timezone far down the zoneStrings array the >>>>>>>>>> improvements will be even bigger. >>>>>>>>>> >>>>>>>>>> I have written a test case[2] which shows when a timezone >>>>>>>>>> which is in lower end of the zone strings array, the >>>>>>>>>> performance improvement this patch can get. >>>>>>>>>> >>>>>>>>>> test results: >>>>>>>>>> >>>>>>>>>> no patch: >>>>>>>>>> The total time is 742 ms! >>>>>>>>>> >>>>>>>>>> with this patch[1] : >>>>>>>>>> The total time is 508 ms! >>>>>>>>>> >>>>>>>>>> If increase the loop times to 1000000, the results are: >>>>>>>>>> >>>>>>>>>> no patch: >>>>>>>>>> The total time is 4743 ms! >>>>>>>>>> >>>>>>>>>> with this patch[1] : >>>>>>>>>> The total time is 2126 ms! >>>>>>>>>> >>>>>>>>>> The java version is: >>>>>>>>>> >>>>>>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java >>>>>>>>>> -version >>>>>>>>>> openjdk version "1.8.0-internal" >>>>>>>>>> OpenJDK Runtime Environment (build >>>>>>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>>>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>>>>>>> >>>>>>>>>> [1] >>>>>>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> [2] GetZoneIndexTest.java >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Best Regards, >>>>>>> >>>>>>> Deven >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> >>>>> Deven >>>> >>> >>> >>> -- >>> Best Regards, >>> >>> Deven >> > > From littlee at linux.vnet.ibm.com Tue Apr 24 06:09:50 2012 From: littlee at linux.vnet.ibm.com (Charles Lee) Date: Tue, 24 Apr 2012 21:09:50 +0800 Subject: Performance patch for DateFormatSymbols.getZoneIndex(String) In-Reply-To: <4F951812.70906@linux.vnet.ibm.com> References: <4F825333.3020306@linux.vnet.ibm.com> <4F82F651.5090008@oracle.com> <4F849167.1090707@gmail.com> <4F852C97.8090603@linux.vnet.ibm.com> <4F85969E.5040200@gmail.com> <4F90D060.5080800@linux.vnet.ibm.com> <4F94E67B.3090800@oracle.com> <4F951812.70906@linux.vnet.ibm.com> Message-ID: <4F96A61E.2010407@linux.vnet.ibm.com> Hi Deven, The patch has been committed @ Changeset: 2c35304e885a Author: youdwei Date: 2012-04-24 21:06 +0800 URL:http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c35304e885a 7163865: Performance improvement for DateFormatSymbols.getZoneIndex(String) Reviewed-by: okutsu Please verify it. Hi Masayoshi, Thank you for reviewing it. On 04/23/2012 04:51 PM, Deven You wrote: > Hi Masayoshi, > > Thanks for your reminder. I have updated the webrev[1]. Please review it. > > [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.03 > > > Thanks a lot! > On 04/23/2012 01:19 PM, Masayoshi Okutsu wrote: >> Hi Deven, >> >> As Yoshito noted, lastZoneIndex needs to be transient because >> DateFormatSymbols is serializable. Otherwise, the change looks good >> to me. >> >> Thanks, >> Masayoshi >> >> On 4/20/2012 11:56 AM, Deven You wrote: >>> Yoshito, >>> >>> It is good to me for your suggestion. So Masayoshi, what is your >>> opinion, is cache the last zone index better than cache recent zone >>> ID and zone indexes, do you have any concern about using last zone >>> index? >>> >>> [1] The updated version of this patch >>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.02/ >>> >>> >>> >>> Thanks a lot! >>> >>> On 04/11/2012 10:35 PM, Yoshito Umaoka wrote: >>>> Deven, >>>> >>>> If last time zone ID is cached along with the previously resolved >>>> index, and getZoneIndex does not check the contents of array, it >>>> might not work when zone strings are updated (setZoneStrings). So, >>>> to make this change work properly, you have to reset >>>> lastZoneID/zoneIndex in setZoneStrings(String[][]). >>>> >>>> In my opinion, we just need to keep last zone index. And it should >>>> be 'transient'. >>>> >>>> For every getZoneIndex(String) call, input ID is compared to >>>> zoneStrings[lastZoneIndex][0] - and if they matches, simply return >>>> lastIndex. Otherwise, do the linear search and update >>>> lastZoneIndex- like below. >>>> >>>> private transient int lastZoneIndex = 0; // with initial value to >>>> be 0, you do not need to check lastZoneIndex >= 0 every time >>>> >>>> final int getZoneIndex(String ID) { >>>> // fast path >>>> >>>> // note: if you want to get rid of lastZoneIndex < >>>> zoneStrings.length, >>>> // setZoneStrings(String[][]) should reset lastZoneIndex to 0 or >>>> any valid index. >>>> >>>> if (lastZoneIndex < zoneStrings.length && >>>> ID.equals(zoneStrings[lastZoneIndex ][0]) { >>>> return lastIndex; >>>> } >>>> >>>> // slow path >>>> for (int index = 0; index < zoneStrings.length; index++) { >>>> if (ID.equals(zoneStrings[index][0]) { >>>> lastZoneIndex = index; >>>> return index; >>>> } >>>> } >>>> } >>>> >>>> -Yoshito >>>> >>>> On 4/11/2012 3:02 AM, Deven You wrote: >>>>> I think Yoshito's suggestion make sense, since the getZoneIndex is >>>>> an internal method, if there is no manual setting to change the >>>>> timezone, The timezone ID won't be changed for one instance of >>>>> SimpleDateFormt. >>>>> >>>>> I updated the patch webrev[1] for this suggestion and test the >>>>> GetZoneIndexTest.java, the performance is almost the same with >>>>> webrev.00. >>>>> >>>>> Please review the updated patch[1]. >>>>> >>>>> Thanks a lot! >>>>> >>>>> [1] http://cr.openjdk.java.net/~littlee/OJDK-548/webrev.01 >>>>> >>>>> >>>>> On 04/11/2012 04:00 AM, Yoshito Umaoka wrote: >>>>>> I'm wondering if caching index matched last time (per >>>>>> DateFormatSymbols instance) is sufficient, instead of keeping >>>>>> multiple results in a hash map. >>>>>> I guess majority of use cases repeatedly refer the index of the >>>>>> zone used by SimpleDateFormat. >>>>>> >>>>>> -Yoshito >>>>>> >>>>>> On 4/9/2012 10:46 AM, Masayoshi Okutsu wrote: >>>>>>> Hi Deven, >>>>>>> >>>>>>> Here are my comments on the proposed changes. >>>>>>> >>>>>>> - zoneIndexCache should be an instance field because WeakHashMap >>>>>>> isn't thread-safe and the order of IDs in zoneStrings differs in >>>>>>> each DateFormatSymbols. >>>>>>> >>>>>>> - equals shouldn't be replaced with equalsIgnoreCase because >>>>>>> time zone IDs are (supposed to be) case-sensitive. >>>>>>> >>>>>>> - Utilize auto-boxing/unboxing. >>>>>>> >>>>>>> Thanks, >>>>>>> Masayoshi >>>>>>> >>>>>>> On 4/9/2012 12:10 PM, Deven You wrote: >>>>>>>> Hi i18n-devs, >>>>>>>> >>>>>>>> The getZoneIndex() method is expensive and it's performance >>>>>>>> depends on which timezone you're in. This is because the code >>>>>>>> steps through a list of timezones until it finds the current >>>>>>>> one. This is happening over and over again. An idea would be to >>>>>>>> either cache it or rewrite the way we store time zone ids, such >>>>>>>> as a hashmap instead of an array. >>>>>>>> This patch[1] is for the cache option. >>>>>>>> >>>>>>>> Applications which format/parse dates using SimpleDateFormat >>>>>>>> repeatedly may obtain benefits from this patch, especially when >>>>>>>> run in a timezone far down the zoneStrings array the >>>>>>>> improvements will be even bigger. >>>>>>>> >>>>>>>> I have written a test case[2] which shows when a timezone which >>>>>>>> is in lower end of the zone strings array, the performance >>>>>>>> improvement this patch can get. >>>>>>>> >>>>>>>> test results: >>>>>>>> >>>>>>>> no patch: >>>>>>>> The total time is 742 ms! >>>>>>>> >>>>>>>> with this patch[1] : >>>>>>>> The total time is 508 ms! >>>>>>>> >>>>>>>> If increase the loop times to 1000000, the results are: >>>>>>>> >>>>>>>> no patch: >>>>>>>> The total time is 4743 ms! >>>>>>>> >>>>>>>> with this patch[1] : >>>>>>>> The total time is 2126 ms! >>>>>>>> >>>>>>>> The java version is: >>>>>>>> >>>>>>>> /home/deven/hgrps/jdk8/build/linux-i586/j2sdk-image/bin/java >>>>>>>> -version >>>>>>>> openjdk version "1.8.0-internal" >>>>>>>> OpenJDK Runtime Environment (build >>>>>>>> 1.8.0-internal-deven_2012_03_14_16_14-b00) >>>>>>>> OpenJDK Server VM (build 24.0-b02, mixed mode) >>>>>>>> >>>>>>>> [1] >>>>>>>> http://cr.openjdk.java.net/~youdwei/DateFormatSymbol_perf/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> [2] GetZoneIndexTest.java >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> >>>>> Deven >>>> >>> >>> >>> -- >>> Best Regards, >>> >>> Deven >> > > > -- > Best Regards, > > Deven -- Yours Charles -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/i18n-dev/attachments/20120424/cbc911aa/attachment-0001.html