From kishor.kharbas at intel.com  Thu Sep  1 05:17:04 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Thu, 1 Sep 2016 05:17:04 +0000
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>

Hello,

I removed the unneeded save and restore of registers in the range XMM6-XMM31 from the x86_64 stubs.
I also removed the #ifdef _WIN64 block from the x86.ad file.

Link to the new patch : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/

Thanks
Kishor


-----Original Message-----
From: Kharbas, Kishor 
Sent: Wednesday, August 24, 2016 6:24 PM
To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows

Thanks, Vladimir, for the quick feedback.
I will look into the stubs that save the registers in the range XMM6-XMM31. The first comment also makes perfect sense.

Thanks
Kishor

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Wednesday, August 24, 2016 3:08 PM
To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows

Hi Kishor,

First, #ifdef _WIN64 is not needed anymore, since the calling convention is now similar to Unix.

Second, I would like you to look more broadly. With this change we don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not sure that we can remove all #ifdef _WIN64 there, but I think we can for most of them. Please take a look.

Thanks,
Vladimir

On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
> Requesting the community to review the patch for 
> https://bugs.openjdk.java.net/browse/JDK-8078122
>
> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>
> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>
> Thank you.
>
> Kishor
>

From dmitrij.pochepko at oracle.com  Thu Sep  1 14:28:57 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 17:28:57 +0300
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
Message-ID: <20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>

Thank you for the attentive review.


> Wow! You wrote a "parser" for the C2 Ideal graph.
>
> // now, find SafePoint->CountedLoopEnd edge
>
> Actually, the CountedLoopEnd input edge should point to SafePoint, not the
> reverse. You should search from LoopEnd up.
fixed
>
>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]] SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt] P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod @ bci:5
>
> The test is too simple. Without UseCountedLoopSafepoints the loop is
> folded to accum += 100. You need something a little more complex.
I've changed the increment to a combination of different shifts.
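
(A minimal sketch of the idea, not the code from the webrev: the class name, the method body, and the shift amounts below are hypothetical. The intent is that an increment which depends on the loop index is less likely to be folded into a single addition than a constant increment.)

```java
final class ShiftLoopSketch {
    // Hypothetical counted loop whose increment mixes two different shifts of i,
    // so the result is not simply "iterations * constant".
    static int testMethod(int iterations) {
        int accum = 0;
        for (int i = 0; i < iterations; i++) {
            accum += (i << 1) + (i >> 1);
        }
        return accum;
    }

    public static void main(String[] args) {
        System.out.println(testMethod(100));
    }
}
```
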
>
> Also, the test should be run only with C2 (Tiered should be off). I don't
> think requiring vm.opt.TieredStopAtLevel is the same as
> -XX:-TieredCompilation.
The requires expression just helps to ensure that level 4 is available, because
testMethod is compiled specifically at level 4 using WhiteBox
(CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
This differs from the original test idea, which was to disable tiered compilation and
trigger compilation by running the loop 2 billion times, making it a
long "stress" test. Now that the method is compiled via WhiteBox, the test
is much faster.
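
(For readers of the archive, the edge check discussed above can be sketched roughly as follows. This is an illustration under assumptions, not the parser from the webrev: the class and method names are hypothetical, and the regular expression assumes each node is printed on one line as "<id> <Name> === <inputs> [[ <outputs> ]]", as in the two dump lines quoted above. It starts from each CountedLoopEnd and looks up at its first input.)

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdealEdgeCheck {
    // Matches "<id> <NodeName> === <inputs> [[" at the start of a dump line.
    private static final Pattern NODE =
            Pattern.compile("^\\s*(\\d+)\\s+(\\w+)\\s+===\\s+(.*?)\\s*\\[\\[");

    /** True if some CountedLoopEnd node has a SafePoint node as its first input. */
    static boolean loopEndHasSafePointInput(List<String> dumpLines) {
        Map<String, String> nameById = new HashMap<>();
        Map<String, String> firstInputById = new HashMap<>();
        for (String line : dumpLines) {
            Matcher m = NODE.matcher(line);
            if (!m.find()) {
                continue; // not a node line
            }
            String[] inputs = m.group(3).trim().split("\\s+");
            nameById.put(m.group(1), m.group(2));
            firstInputById.put(m.group(1), inputs[0]);
        }
        // Search from each CountedLoopEnd up to the node feeding its first input.
        for (Map.Entry<String, String> e : nameById.entrySet()) {
            if (e.getValue().equals("CountedLoopEnd")) {
                String inputId = firstInputById.get(e.getKey());
                if ("SafePoint".equals(nameById.get(inputId))) {
                    return true;
                }
            }
        }
        return false;
    }
}
```
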

Please take a look at v03: 
http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/

Thanks,
Dmitrij
>
> Thanks,
> Vladimir
>
> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>> Hi,
>>
>> Please take a look at v02.
>>
>> I've rewritten this test. It now launches a VM with
>> -XX:+UseCountedLoopSafepoints (restricting compilation to the tested method,
>> which has a single simple counted loop) and checks that the output of
>> -XX:+PrintIdeal has a SafePoint -> CountedLoopEnd edge.
>>
>> Then it launches the same VM with -XX:-UseCountedLoopSafepoints and checks that
>> there is no such edge.
>>
>> I've tested fix via rbt on all platforms.
>>
>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>
>>
>> Thanks,
>>
>> Dmitrij
>>
>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>> I am not against marking the test as stress, but I think the test itself is
>>> not good. It should be rewritten. I added a comment to JBS with the
>>> discussion from the original 8146096 RFR.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>> Hi,
>>>>
>>>> please review small fix for 8146096 - [TEST BUG]
>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>
>>>>
>>>> The test times out on slow platforms, so this fix adds execution control
>>>> based on elapsed time. The test is also marked as stress.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>
>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>
>>>>
>>>> I've tested this fix on linux-amd64.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Dmitrij
>>>>
>>


From vladimir.x.ivanov at oracle.com  Thu Sep  1 15:35:32 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 1 Sep 2016 18:35:32 +0300
Subject: [9] RFR(XS): 8165050: [jittester] generated tests cannot be run
	with jtreg
In-Reply-To: <3e682080-a02c-d2ab-2827-e4233a49082f@oracle.com>
References: <3e682080-a02c-d2ab-2827-e4233a49082f@oracle.com>
Message-ID: <e9f9baae-4b19-32b5-c627-428db5213fda@oracle.com>

Reviewed.

Best regards,
Vladimir Ivanov

On 8/31/16 8:14 PM, Tatiana Pivovarova wrote:
> Hello!
>
> Please review this small patch
>
> Bug:
> After jdk.test.lib was moved and the jittester build was subsequently fixed
> (8164648), jtreg cannot find the jdk.test.lib library to compile generated tests.
> Fix:
> After considering several approaches, we decided to return to the previous
> approach: just copy jdk.test.lib into the $TESTBASE folder.
>
> Tested locally
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8165050
> webrev: http://cr.openjdk.java.net/~tpivovarova/8165050/webrev.00/
>
> Thanks,
> Tatiana
>

From dmitrij.pochepko at oracle.com  Thu Sep  1 17:00:13 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 20:00:13 +0300
Subject: RFR(XS): 8165244 - Unquarantine
	compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java
Message-ID: <fa4f43c6-fd16-f6e1-3269-59ad3327bdb6@oracle.com>

Hi,

Please review this small patch for 8165244 - Unquarantine
compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java

The quarantine reason (JDK-8139383) has been closed as a duplicate of the fixed
JDK-8139700, so
compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java should be
unquarantined.

webrev: http://cr.openjdk.java.net/~dpochepk/8165244/webrev.01/

CR: https://bugs.openjdk.java.net/browse/JDK-8165244

I've tested the fix on linux-amd64.

Thanks,

Dmitrij


From tatiana.pivovarova at oracle.com  Thu Sep  1 17:05:26 2016
From: tatiana.pivovarova at oracle.com (Tatiana Pivovarova)
Date: Thu, 1 Sep 2016 20:05:26 +0300
Subject: [9] RFR(XS): 8165050: [jittester] generated tests cannot be run
	with jtreg
In-Reply-To: <e9f9baae-4b19-32b5-c627-428db5213fda@oracle.com>
References: <3e682080-a02c-d2ab-2827-e4233a49082f@oracle.com>
	<e9f9baae-4b19-32b5-c627-428db5213fda@oracle.com>
Message-ID: <ab60f697-5794-6c13-b3ef-684029232585@oracle.com>

Thanks Vladimir!


On 09/01/2016 06:35 PM, Vladimir Ivanov wrote:
> Reviewed.
>
> Best regards,
> Vladimir Ivanov
>
> On 8/31/16 8:14 PM, Tatiana Pivovarova wrote:
>> Hello!
>>
>> Please review this small patch
>>
>> Bug:
>> After moving jdk.test.lib and next fixing of jittester building
>> (8164648) JTREG cannot find jdk.test.lib library to compile generated 
>> test
>> Fix:
>> After considering several approaches we decided to return the previous
>> approach - just copy jdk.test.lib in $TESTBASE folder
>>
>> Tested locally
>>
>> bug: https://bugs.openjdk.java.net/browse/JDK-8165050
>> webrev: http://cr.openjdk.java.net/~tpivovarova/8165050/webrev.00/
>>
>> Thanks,
>> Tatiana
>>


From vladimir.kozlov at oracle.com  Thu Sep  1 17:36:15 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 10:36:15 -0700
Subject: RFR(XS): 8165244 - Unquarantine
	compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java
In-Reply-To: <fa4f43c6-fd16-f6e1-3269-59ad3327bdb6@oracle.com>
References: <fa4f43c6-fd16-f6e1-3269-59ad3327bdb6@oracle.com>
Message-ID: <caea79ca-7a5f-604b-11fc-8f2785a394ac@oracle.com>

Good.

thanks,
Vladimir

On 9/1/16 10:00 AM, Dmitrij Pochepko wrote:
> Hi,
>
> please review small patch for 8165244 - Unquarantine
> compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java
>
> A quarantine reason (JDK-8139383) is closed as duplicate of fixed
> JDK-8139700, so,
> compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java should be
> unquarantined.
>
> webrev: http://cr.openjdk.java.net/~dpochepk/8165244/webrev.01/
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8165244
>
> I've tested fix on linux-amd64.
>
> Thanks,
>
> Dmitrij
>

From dmitrij.pochepko at oracle.com  Thu Sep  1 17:38:56 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 20:38:56 +0300
Subject: RFR(XS): 8165244 - Unquarantine
	compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java
In-Reply-To: <caea79ca-7a5f-604b-11fc-8f2785a394ac@oracle.com>
References: <fa4f43c6-fd16-f6e1-3269-59ad3327bdb6@oracle.com>
	<caea79ca-7a5f-604b-11fc-8f2785a394ac@oracle.com>
Message-ID: <d001b0da-d5c6-7370-a2ac-28cc3dc6451b@oracle.com>

Thank you!


On 01.09.2016 20:36, Vladimir Kozlov wrote:
> Good.
>
> thanks,
> Vladimir
>
> On 9/1/16 10:00 AM, Dmitrij Pochepko wrote:
>> Hi,
>>
>> please review small patch for 8165244 - Unquarantine
>> compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java
>>
>> A quarantine reason (JDK-8139383) is closed as duplicate of fixed
>> JDK-8139700, so,
>> compiler/jvmci/compilerToVM/ExecuteInstalledCodeTest.java should be
>> unquarantined.
>>
>> webrev: http://cr.openjdk.java.net/~dpochepk/8165244/webrev.01/
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8165244
>>
>> I've tested fix on linux-amd64.
>>
>> Thanks,
>>
>> Dmitrij
>>


From vladimir.kozlov at oracle.com  Thu Sep  1 17:40:57 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 10:40:57 -0700
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
	<20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
Message-ID: <b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>

Yes, this looks good.

On 9/1/16 7:28 AM, Dmitrij Pochepko wrote:
> Thank you for attentive review.
>
>
>> Wow! You wrote "parser" to C2 Ideal graph.
>>
>> // now, find SafePoint->CountedLoopEnd edge
>>
>> Actually CountedLoopEnd  input edge should point to SafePoint. Not
>> reverse. You should search from LoopEnd up.
> fixed
>>
>>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]]
>> SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt]
>> P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod
>> @ bci:5
>>
>> The test is too simple. Without UseCountedLoopSafepoints the loop is
>> folded to accum += 100. You need a little more complex.
> I've changed increment to combination of different shifts
>>
>> Also test should be run only with C2 (Tiered should be off). I don't
>> think require vm.opt.TieredStopAtLevel is the same as
>> -XX:-TieredCompilation.
> Requires expression just help to ensure level 4 is available, because
> testMethod specifically compiled on level=4 using WhiteBox
> (CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
> This differs from original test idea to disable tiered compilation and
> trigger compilation by running cycle 2 billion times, which made it a
> long "stress" test. Now that method compiled via WhiteBox, this test
> became much more faster

Got it.

Thanks,
Vladimir

>
> Please take a look at v03:
> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/
>
> Thanks,
> Dmitrij
>>
>> Thanks,
>> Vladimir
>>
>> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>>> Hi,
>>>
>>> Please take a look at v02.
>>>
>>> I've rewritten this test. Now it launch vm with
>>> -XX:+UseCountedLoopSafepoints (restricting compilation to tested method
>>> with single simple counted loop) and checks that output of
>>> -XX:+PrintIdeal have edge SafePoint -> CountedLoopEnd
>>>
>>> Then, launch the same with -XX:-UseCountedLoopSafepoints and checks that
>>> there is no such edge.
>>>
>>> I've tested fix via rbt on all platforms.
>>>
>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>>
>>>
>>> Thanks,
>>>
>>> Dmitrij
>>>
>>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>>> I am not against marking test as stress but I think the test itself is
>>>> not good. It should be rewrote. I added comment to the JBS with
>>>> discussion during original 8146096 RFR.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>>> Hi,
>>>>>
>>>>> please review small fix for 8146096 - [TEST BUG]
>>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>>
>>>>>
>>>>> Test timeouts on slow platforms, so, this fix adds execution control
>>>>> with respect to elapsed time. Also, test marked as stress.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>>
>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>>
>>>>>
>>>>> I've tested this fix on linux-amd64.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dmitrij
>>>>>
>>>
>

From dmitrij.pochepko at oracle.com  Thu Sep  1 17:44:24 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 20:44:24 +0300
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
	<20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
	<b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
Message-ID: <3b88b0b2-4f2e-175e-c45e-e85e2ca719ef@oracle.com>

Thank you for the review!


On 01.09.2016 20:40, Vladimir Kozlov wrote:
> Yes, this looks good.
>
> On 9/1/16 7:28 AM, Dmitrij Pochepko wrote:
>> Thank you for attentive review.
>>
>>
>>> Wow! You wrote "parser" to C2 Ideal graph.
>>>
>>> // now, find SafePoint->CountedLoopEnd edge
>>>
>>> Actually CountedLoopEnd  input edge should point to SafePoint. Not
>>> reverse. You should search from LoopEnd up.
>> fixed
>>>
>>>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]]
>>> SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>>>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt]
>>> P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod
>>> @ bci:5
>>>
>>> The test is too simple. Without UseCountedLoopSafepoints the loop is
>>> folded to accum += 100. You need a little more complex.
>> I've changed increment to combination of different shifts
>>>
>>> Also test should be run only with C2 (Tiered should be off). I don't
>>> think require vm.opt.TieredStopAtLevel is the same as
>>> -XX:-TieredCompilation.
>> Requires expression just help to ensure level 4 is available, because
>> testMethod specifically compiled on level=4 using WhiteBox
>> (CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
>> This differs from original test idea to disable tiered compilation and
>> trigger compilation by running cycle 2 billion times, which made it a
>> long "stress" test. Now that method compiled via WhiteBox, this test
>> became much more faster
>
> Got it.
>
> Thanks,
> Vladimir
>
>>
>> Please take a look at v03:
>> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/
>>
>> Thanks,
>> Dmitrij
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>>>> Hi,
>>>>
>>>> Please take a look at v02.
>>>>
>>>> I've rewritten this test. Now it launch vm with
>>>> -XX:+UseCountedLoopSafepoints (restricting compilation to tested 
>>>> method
>>>> with single simple counted loop) and checks that output of
>>>> -XX:+PrintIdeal have edge SafePoint -> CountedLoopEnd
>>>>
>>>> Then, launch the same with -XX:-UseCountedLoopSafepoints and checks 
>>>> that
>>>> there is no such edge.
>>>>
>>>> I've tested fix via rbt on all platforms.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Dmitrij
>>>>
>>>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>>>> I am not against marking test as stress but I think the test 
>>>>> itself is
>>>>> not good. It should be rewrote. I added comment to the JBS with
>>>>> discussion during original 8146096 RFR.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>>>> Hi,
>>>>>>
>>>>>> please review small fix for 8146096 - [TEST BUG]
>>>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>>>
>>>>>>
>>>>>> Test timeouts on slow platforms, so, this fix adds execution control
>>>>>> with respect to elapsed time. Also, test marked as stress.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>>>
>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>>>
>>>>>>
>>>>>> I've tested this fix on linux-amd64.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Dmitrij
>>>>>>
>>>>
>>


From vladimir.kozlov at oracle.com  Thu Sep  1 17:47:56 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 10:47:56 -0700
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
	<20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
	<b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
Message-ID: <c4285a46-f34c-69f4-f084-1e38d0eacc52@oracle.com>

I forgot another useful C2 flag, -XX:LoopUnrollLimit=0, to avoid loop
iteration splitting. Otherwise you will get several loops (pre-, main-,
and post-).

Thanks,
Vladimir

On 9/1/16 10:40 AM, Vladimir Kozlov wrote:
> Yes, this looks good.
>
> On 9/1/16 7:28 AM, Dmitrij Pochepko wrote:
>> Thank you for attentive review.
>>
>>
>>> Wow! You wrote "parser" to C2 Ideal graph.
>>>
>>> // now, find SafePoint->CountedLoopEnd edge
>>>
>>> Actually CountedLoopEnd  input edge should point to SafePoint. Not
>>> reverse. You should search from LoopEnd up.
>> fixed
>>>
>>>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]]
>>> SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>>>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt]
>>> P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod
>>> @ bci:5
>>>
>>> The test is too simple. Without UseCountedLoopSafepoints the loop is
>>> folded to accum += 100. You need a little more complex.
>> I've changed increment to combination of different shifts
>>>
>>> Also test should be run only with C2 (Tiered should be off). I don't
>>> think require vm.opt.TieredStopAtLevel is the same as
>>> -XX:-TieredCompilation.
>> Requires expression just help to ensure level 4 is available, because
>> testMethod specifically compiled on level=4 using WhiteBox
>> (CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
>> This differs from original test idea to disable tiered compilation and
>> trigger compilation by running cycle 2 billion times, which made it a
>> long "stress" test. Now that method compiled via WhiteBox, this test
>> became much more faster
>
> Got it.
>
> Thanks,
> Vladimir
>
>>
>> Please take a look at v03:
>> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/
>>
>> Thanks,
>> Dmitrij
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>>>> Hi,
>>>>
>>>> Please take a look at v02.
>>>>
>>>> I've rewritten this test. Now it launch vm with
>>>> -XX:+UseCountedLoopSafepoints (restricting compilation to tested method
>>>> with single simple counted loop) and checks that output of
>>>> -XX:+PrintIdeal have edge SafePoint -> CountedLoopEnd
>>>>
>>>> Then, launch the same with -XX:-UseCountedLoopSafepoints and checks
>>>> that
>>>> there is no such edge.
>>>>
>>>> I've tested fix via rbt on all platforms.
>>>>
>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Dmitrij
>>>>
>>>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>>>> I am not against marking test as stress but I think the test itself is
>>>>> not good. It should be rewrote. I added comment to the JBS with
>>>>> discussion during original 8146096 RFR.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>>>> Hi,
>>>>>>
>>>>>> please review small fix for 8146096 - [TEST BUG]
>>>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>>>
>>>>>>
>>>>>> Test timeouts on slow platforms, so, this fix adds execution control
>>>>>> with respect to elapsed time. Also, test marked as stress.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>>>
>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>>>
>>>>>>
>>>>>> I've tested this fix on linux-amd64.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Dmitrij
>>>>>>
>>>>
>>

From vladimir.kozlov at oracle.com  Thu Sep  1 17:52:14 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 10:52:14 -0700
Subject: RFR(S): 8157956 - OverflowCodeCacheTest.java fails with Out of
	space in CodeCache for method handle intrinsic
In-Reply-To: <62af160f-2438-8dcf-db3e-8954f931951e@oracle.com>
References: <62af160f-2438-8dcf-db3e-8954f931951e@oracle.com>
Message-ID: <4445baa9-4c2b-3e14-ae31-2e5a32331981@oracle.com>

Good.

Thanks,
Vladimir

On 8/31/16 11:07 AM, Dmitrij Pochepko wrote:
> Hi,
>
> Please review this small fix for 8157956 - OverflowCodeCacheTest.java fails
> with Out of space in CodeCache for method handle intrinsic
>
> The test failed with "VirtualMachineError: Out of space for method handle
> intrinsic" because of code cache exhaustion. This happened at the final
> assert. This fix moves the assert after the logic that frees the code cache,
> which gets rid of the error.
>
> I've tested it on linux-i586
>
> webrev: http://cr.openjdk.java.net/~dpochepk/8157956/webrev.01/
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8157956
>
> Thanks,
>
> Dmitrij
>

From dmitrij.pochepko at oracle.com  Thu Sep  1 17:57:19 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 20:57:19 +0300
Subject: RFR(S): 8157956 - OverflowCodeCacheTest.java fails with Out of
	space in CodeCache for method handle intrinsic
In-Reply-To: <4445baa9-4c2b-3e14-ae31-2e5a32331981@oracle.com>
References: <62af160f-2438-8dcf-db3e-8954f931951e@oracle.com>
	<4445baa9-4c2b-3e14-ae31-2e5a32331981@oracle.com>
Message-ID: <7f9fbfef-0162-aa40-72ec-7ddf0ab8ede6@oracle.com>

Thank you!


On 01.09.2016 20:52, Vladimir Kozlov wrote:
> Good.
>
> Thanks,
> Vladimir
>
> On 8/31/16 11:07 AM, Dmitrij Pochepko wrote:
>> Hi,
>>
>> please review small fix for 8157956 - OverflowCodeCacheTest.java fails
>> with Out of space in CodeCache for method handle intrinsic
>>
>> Test failed with "VirtualMachineError: Out of space for method handle
>> intrinsic" because of code cache exhaustion. This happened on final
>> assert. This fix moves assert after freeing codecache logic to get rid
>> of error.
>>
>> I've tested it on linux-i586
>>
>> webrev: http://cr.openjdk.java.net/~dpochepk/8157956/webrev.01/
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8157956
>>
>> Thanks,
>>
>> Dmitrij
>>


From dmitrij.pochepko at oracle.com  Thu Sep  1 18:03:19 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 21:03:19 +0300
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <c4285a46-f34c-69f4-f084-1e38d0eacc52@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
	<20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
	<b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
	<c4285a46-f34c-69f4-f084-1e38d0eacc52@oracle.com>
Message-ID: <0456966a-6e5e-c293-a16e-59a2cc2fcc8a@oracle.com>

Hi,

I've added -XX:LoopUnrollLimit=0 to the spawned VM's options and tested on
linux-amd64.
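
(As a rough illustration only, the flag set for the two spawned VMs could be assembled as below. The class and method names are hypothetical, and the list contains only the flags named in this thread; the actual test in the webrev may well pass additional options, for example to enable WhiteBox or to restrict compilation to the tested method.)

```java
import java.util.ArrayList;
import java.util.List;

final class SpawnedVmFlags {
    /** Builds the VM flags for one run; the boolean selects +/- UseCountedLoopSafepoints. */
    static List<String> flagsFor(boolean countedLoopSafepoints) {
        List<String> flags = new ArrayList<>();
        flags.add("-XX:" + (countedLoopSafepoints ? "+" : "-") + "UseCountedLoopSafepoints");
        flags.add("-XX:+PrintIdeal");        // dump the Ideal graph for inspection
        flags.add("-XX:LoopUnrollLimit=0");  // avoid splitting into pre-, main-, and post-loops
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(flagsFor(true));
        System.out.println(flagsFor(false));
    }
}
```
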

Please take a look at v04: 
http://cr.openjdk.java.net/~dpochepk/8146096/webrev.04/

Thanks,

Dmitrij

On 01.09.2016 20:47, Vladimir Kozlov wrote:
> I forgot an other useful C2 flag -XX:LoopUnrollLimit=0  to avoid loop 
> iteration splitting. Otherwise you will get several loops (pre-, main- 
> post-).
>
> Thanks,
> Vladimir
>
> On 9/1/16 10:40 AM, Vladimir Kozlov wrote:
>> Yes, this looks good.
>>
>> On 9/1/16 7:28 AM, Dmitrij Pochepko wrote:
>>> Thank you for attentive review.
>>>
>>>
>>>> Wow! You wrote "parser" to C2 Ideal graph.
>>>>
>>>> // now, find SafePoint->CountedLoopEnd edge
>>>>
>>>> Actually CountedLoopEnd  input edge should point to SafePoint. Not
>>>> reverse. You should search from LoopEnd up.
>>> fixed
>>>>
>>>>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]]
>>>> SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>>>>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt]
>>>> P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod
>>>> @ bci:5
>>>>
>>>> The test is too simple. Without UseCountedLoopSafepoints the loop is
>>>> folded to accum += 100. You need a little more complex.
>>> I've changed increment to combination of different shifts
>>>>
>>>> Also test should be run only with C2 (Tiered should be off). I don't
>>>> think require vm.opt.TieredStopAtLevel is the same as
>>>> -XX:-TieredCompilation.
>>> Requires expression just help to ensure level 4 is available, because
>>> testMethod specifically compiled on level=4 using WhiteBox
>>> (CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
>>> This differs from original test idea to disable tiered compilation and
>>> trigger compilation by running cycle 2 billion times, which made it a
>>> long "stress" test. Now that method compiled via WhiteBox, this test
>>> became much more faster
>>
>> Got it.
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Please take a look at v03:
>>> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/
>>>
>>> Thanks,
>>> Dmitrij
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>>>>> Hi,
>>>>>
>>>>> Please take a look at v02.
>>>>>
>>>>> I've rewritten this test. Now it launch vm with
>>>>> -XX:+UseCountedLoopSafepoints (restricting compilation to tested 
>>>>> method
>>>>> with single simple counted loop) and checks that output of
>>>>> -XX:+PrintIdeal have edge SafePoint -> CountedLoopEnd
>>>>>
>>>>> Then, launch the same with -XX:-UseCountedLoopSafepoints and checks
>>>>> that
>>>>> there is no such edge.
>>>>>
>>>>> I've tested fix via rbt on all platforms.
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dmitrij
>>>>>
>>>>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>>>>> I am not against marking test as stress but I think the test 
>>>>>> itself is
>>>>>> not good. It should be rewrote. I added comment to the JBS with
>>>>>> discussion during original 8146096 RFR.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> please review small fix for 8146096 - [TEST BUG]
>>>>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>>>>
>>>>>>>
>>>>>>> Test timeouts on slow platforms, so, this fix adds execution 
>>>>>>> control
>>>>>>> with respect to elapsed time. Also, test marked as stress.
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>>>>
>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>>>>
>>>>>>>
>>>>>>> I've tested this fix on linux-amd64.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Dmitrij
>>>>>>>
>>>>>
>>>


From vladimir.kozlov at oracle.com  Thu Sep  1 18:03:53 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 11:03:53 -0700
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <0456966a-6e5e-c293-a16e-59a2cc2fcc8a@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
	<20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
	<b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
	<c4285a46-f34c-69f4-f084-1e38d0eacc52@oracle.com>
	<0456966a-6e5e-c293-a16e-59a2cc2fcc8a@oracle.com>
Message-ID: <e3600979-e880-ea87-a0ad-e553746c29f6@oracle.com>

Nice!

Thank you,
Vladimir

On 9/1/16 11:03 AM, Dmitrij Pochepko wrote:
> Hi,
>
> I've added LoopUnrollLimit=0 to spawned vm options and tested on
> linux-amd64.
>
> Please take a look at v04:
> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.04/
>
> Thanks,
>
> Dmitrij
>
> On 01.09.2016 20:47, Vladimir Kozlov wrote:
>> I forgot an other useful C2 flag -XX:LoopUnrollLimit=0  to avoid loop
>> iteration splitting. Otherwise you will get several loops (pre-, main-
>> post-).
>>
>> Thanks,
>> Vladimir
>>
>> On 9/1/16 10:40 AM, Vladimir Kozlov wrote:
>>> Yes, this looks good.
>>>
>>> On 9/1/16 7:28 AM, Dmitrij Pochepko wrote:
>>>> Thank you for attentive review.
>>>>
>>>>
>>>>> Wow! You wrote "parser" to C2 Ideal graph.
>>>>>
>>>>> // now, find SafePoint->CountedLoopEnd edge
>>>>>
>>>>> Actually CountedLoopEnd  input edge should point to SafePoint. Not
>>>>> reverse. You should search from LoopEnd up.
>>>> fixed
>>>>>
>>>>>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]]
>>>>> SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>>>>>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt]
>>>>> P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod
>>>>> @ bci:5
>>>>>
>>>>> The test is too simple. Without UseCountedLoopSafepoints the loop is
>>>>> folded to accum += 100. You need a little more complex.
>>>> I've changed increment to combination of different shifts
>>>>>
>>>>> Also test should be run only with C2 (Tiered should be off). I don't
>>>>> think require vm.opt.TieredStopAtLevel is the same as
>>>>> -XX:-TieredCompilation.
>>>> Requires expression just help to ensure level 4 is available, because
>>>> testMethod specifically compiled on level=4 using WhiteBox
>>>> (CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
>>>> This differs from original test idea to disable tiered compilation and
>>>> trigger compilation by running cycle 2 billion times, which made it a
>>>> long "stress" test. Now that method compiled via WhiteBox, this test
>>>> became much more faster
>>>
>>> Got it.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>>
>>>> Please take a look at v03:
>>>> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/
>>>>
>>>> Thanks,
>>>> Dmitrij
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please take a look at v02.
>>>>>>
>>>>>> I've rewritten this test. Now it launch vm with
>>>>>> -XX:+UseCountedLoopSafepoints (restricting compilation to tested
>>>>>> method
>>>>>> with single simple counted loop) and checks that output of
>>>>>> -XX:+PrintIdeal have edge SafePoint -> CountedLoopEnd
>>>>>>
>>>>>> Then, launch the same with -XX:-UseCountedLoopSafepoints and checks
>>>>>> that
>>>>>> there is no such edge.
>>>>>>
>>>>>> I've tested fix via rbt on all platforms.
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Dmitrij
>>>>>>
>>>>>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>>>>>> I am not against marking test as stress but I think the test
>>>>>>> itself is
>>>>>>> not good. It should be rewrote. I added comment to the JBS with
>>>>>>> discussion during original 8146096 RFR.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> please review small fix for 8146096 - [TEST BUG]
>>>>>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>>>>>
>>>>>>>>
>>>>>>>> Test timeouts on slow platforms, so, this fix adds execution
>>>>>>>> control
>>>>>>>> with respect to elapsed time. Also, test marked as stress.
>>>>>>>>
>>>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>>>>>
>>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>>>>>
>>>>>>>>
>>>>>>>> I've tested this fix on linux-amd64.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Dmitrij
>>>>>>>>
>>>>>>
>>>>
>
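
The check discussed in this thread (search upward from CountedLoopEnd and see whether its control input is a SafePoint) can be sketched as below. This is only an illustration, not the code from the webrev: the class and method names are hypothetical, and it assumes each node is printed on a single -XX:+PrintIdeal line of the form "<id> <Name> === <inputs> [[ <outputs> ]]", as in the two lines quoted in the review.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not taken from the webrev.
final class IdealEdgeCheck {
    // Matches "<id> <NodeName> === <inputs...> [[" at the start of a line.
    private static final Pattern NODE =
        Pattern.compile("^\\s*(\\d+)\\s+(\\w+)\\s+===\\s+([^\\[]*)\\[\\[");

    /** True if some CountedLoopEnd has a SafePoint as its first (control) input. */
    static boolean loopEndFedBySafePoint(List<String> printIdealLines) {
        Map<String, String> nameById = new HashMap<>();
        List<String> loopEndControlInputs = new ArrayList<>();
        for (String line : printIdealLines) {
            Matcher m = NODE.matcher(line);
            if (!m.find()) {
                continue; // not a node line
            }
            nameById.put(m.group(1), m.group(2));
            if (m.group(2).equals("CountedLoopEnd")) {
                String[] inputs = m.group(3).trim().split("\\s+");
                if (!inputs[0].isEmpty()) {
                    loopEndControlInputs.add(inputs[0]);
                }
            }
        }
        // Resolve ids only after reading every line, so node order does not matter.
        for (String id : loopEndControlInputs) {
            if ("SafePoint".equals(nameById.get(id))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The two node lines quoted in the review, joined back onto single lines.
        List<String> lines = Arrays.asList(
            " 115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]] "
                + "SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21",
            " 116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt] "
                + "P=0.500000, C=6633.000000 !orig=97,[80] !jvms: SimpleTest::testMethod @ bci:5");
        System.out.println(loopEndFedBySafePoint(lines));
    }
}
```

With -XX:-UseCountedLoopSafepoints the same check would be expected to return false for the tested method. Running with -XX:LoopUnrollLimit=0, as suggested above, keeps the output down to a single loop rather than pre-, main- and post-loops.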

From dmitrij.pochepko at oracle.com  Thu Sep  1 18:07:52 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 1 Sep 2016 21:07:52 +0300
Subject: RFR(S): 8146096 - [TEST BUG]
	compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
In-Reply-To: <e3600979-e880-ea87-a0ad-e553746c29f6@oracle.com>
References: <4d957fd9-442a-9096-ac01-0f39a974f6a2@oracle.com>
	<91e6de05-de11-d026-aa25-127775169479@oracle.com>
	<a28530eb-3262-126e-0732-a730ea7fae22@oracle.com>
	<6e2b1074-0573-b078-3ade-1d028e5fb377@oracle.com>
	<20b0d0de-c992-e7e3-dff4-601cc6ab03f5@oracle.com>
	<b98cc0c5-da8c-4f16-a7c9-b0b4b04d7b7f@oracle.com>
	<c4285a46-f34c-69f4-f084-1e38d0eacc52@oracle.com>
	<0456966a-6e5e-c293-a16e-59a2cc2fcc8a@oracle.com>
	<e3600979-e880-ea87-a0ad-e553746c29f6@oracle.com>
Message-ID: <c9337c09-e847-139c-1e74-8c8d2b06836b@oracle.com>

Thank you!


On 01.09.2016 21:03, Vladimir Kozlov wrote:
> Nice!
>
> Thank you,
> Vladimir
>
> On 9/1/16 11:03 AM, Dmitrij Pochepko wrote:
>> Hi,
>>
>> I've added LoopUnrollLimit=0 to spawned vm options and tested on
>> linux-amd64.
>>
>> Please take a look at v04:
>> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.04/
>>
>> Thanks,
>>
>> Dmitrij
>>
>> On 01.09.2016 20:47, Vladimir Kozlov wrote:
>>> I forgot an other useful C2 flag -XX:LoopUnrollLimit=0  to avoid loop
>>> iteration splitting. Otherwise you will get several loops (pre-, main-
>>> post-).
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 9/1/16 10:40 AM, Vladimir Kozlov wrote:
>>>> Yes, this looks good.
>>>>
>>>> On 9/1/16 7:28 AM, Dmitrij Pochepko wrote:
>>>>> Thank you for attentive review.
>>>>>
>>>>>
>>>>>> Wow! You wrote "parser" to C2 Ideal graph.
>>>>>>
>>>>>> // now, find SafePoint->CountedLoopEnd edge
>>>>>>
>>>>>> Actually CountedLoopEnd  input edge should point to SafePoint. Not
>>>>>> reverse. You should search from LoopEnd up.
>>>>> fixed
>>>>>>
>>>>>>  115    SafePoint    ===  112  1  107  1  1  10  110  [[ 116 ]]
>>>>>> SafePoint !orig=76 !jvms: SimpleTest::testMethod @ bci:21
>>>>>>  116    CountedLoopEnd    ===  115  105  [[ 117  118 ]] [lt]
>>>>>> P=0.500000, C=6633.000000 !orig=97,[80] !jvms: 
>>>>>> SimpleTest::testMethod
>>>>>> @ bci:5
>>>>>>
>>>>>> The test is too simple. Without UseCountedLoopSafepoints the loop is
>>>>>> folded to accum += 100. You need a little more complex.
>>>>> I've changed increment to combination of different shifts
>>>>>>
>>>>>> Also test should be run only with C2 (Tiered should be off). I don't
>>>>>> think require vm.opt.TieredStopAtLevel is the same as
>>>>>> -XX:-TieredCompilation.
>>>>> Requires expression just help to ensure level 4 is available, because
>>>>> testMethod specifically compiled on level=4 using WhiteBox
>>>>> (CompilerWhiteBoxTest.COMP_LEVEL_FULL_OPTIMIZATION).
>>>>> This differs from original test idea to disable tiered compilation 
>>>>> and
>>>>> trigger compilation by running cycle 2 billion times, which made it a
>>>>> long "stress" test. Now that method compiled via WhiteBox, this test
>>>>> became much more faster
>>>>
>>>> Got it.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Please take a look at v03:
>>>>> http://cr.openjdk.java.net/~dpochepk/8146096/webrev.03/
>>>>>
>>>>> Thanks,
>>>>> Dmitrij
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 8/31/16 12:03 PM, Dmitrij Pochepko wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please take a look at v02.
>>>>>>>
>>>>>>> I've rewritten this test. Now it launch vm with
>>>>>>> -XX:+UseCountedLoopSafepoints (restricting compilation to tested
>>>>>>> method
>>>>>>> with single simple counted loop) and checks that output of
>>>>>>> -XX:+PrintIdeal have edge SafePoint -> CountedLoopEnd
>>>>>>>
>>>>>>> Then, launch the same with -XX:-UseCountedLoopSafepoints and checks
>>>>>>> that
>>>>>>> there is no such edge.
>>>>>>>
>>>>>>> I've tested fix via rbt on all platforms.
>>>>>>>
>>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.02/
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Dmitrij
>>>>>>>
>>>>>>> On 29.08.2016 19:28, Vladimir Kozlov wrote:
>>>>>>>> I am not against marking test as stress but I think the test
>>>>>>>> itself is
>>>>>>>> not good. It should be rewrote. I added comment to the JBS with
>>>>>>>> discussion during original 8146096 RFR.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 8/29/16 7:08 AM, Dmitrij Pochepko wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> please review small fix for 8146096 - [TEST BUG]
>>>>>>>>> compiler/loopopts/UseCountedLoopSafepoints.java Timeouts
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Test timeouts on slow platforms, so, this fix adds execution
>>>>>>>>> control
>>>>>>>>> with respect to elapsed time. Also, test marked as stress.
>>>>>>>>>
>>>>>>>>> webrev: http://cr.openjdk.java.net/~dpochepk/8146096/webrev.01/
>>>>>>>>>
>>>>>>>>> CR: https://bugs.openjdk.java.net/browse/JDK-8146096
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've tested this fix on linux-amd64.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Dmitrij
>>>>>>>>>
>>>>>>>
>>>>>
>>


From Leonid.Mesnik at oracle.com  Thu Sep  1 18:13:45 2016
From: Leonid.Mesnik at oracle.com (Leonid Mesnik)
Date: Thu, 1 Sep 2016 21:13:45 +0300
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
Message-ID: <57C86FD9.2030508@oracle.com>

Hi

The hotspot compiler changes should go to jdk9/hs-comp and not to 9-dev. 
Also hotspot-compiler-dev at openjdk.java.net alias should be used for 
compiler specific product and test changes.

It is unclear from the issue description and comments what the root cause of 
the failure is and how it was fixed. Could you please add this information?

Leonid

On 01.09.2016 20:58, Alexander Vorobyev wrote:
>
> Hi All,
>
> I'd like review for JDK-8146128 
> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>
> Test passes with timeout increased. Looks like it times out in 
> sub-tests where AESIntrinsics are disabled (testNoUseAES(), 
> testNoUseAESIntrinsic()). The easiest way to fix this test is to 
> increase timeout.
>
> Run parameter was added:
> @run main/othervm/timeout=300
>
>
> Here is webrev:
> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>
>
> Thanks,
> Alexander
>
>
>


From vladimir.kozlov at oracle.com  Thu Sep  1 18:15:50 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 11:15:50 -0700
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <57C86FD9.2030508@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
	<57C86FD9.2030508@oracle.com>
Message-ID: <a7688c55-ec1a-8cd0-8993-cb90213951ef@oracle.com>

Yes, removing jdk9-dev from To:

300 is not enough. From bug report:

elapsed time (seconds): 482.214

Another way to solve this is to check the remaining time after each test 
(forked VM) has executed and exit gracefully.

Thanks,
Vladimir


On 9/1/16 11:13 AM, Leonid Mesnik wrote:
> Hi
>
> The hotspot compiler changes should go to jdk9/hs-comp and not to 9-dev.
> Also hotspot-compiler-dev at openjdk.java.net alias should be used for
> compiler specific product and test changes.
>
> It is unclear from issue description/comment what is the  root cause of
> failure and how it was fixed. Could you please add this information.
>
> Leonid
>
> On 01.09.2016 20:58, Alexander Vorobyev wrote:
>>
>> Hi All,
>>
>> I'd like review for JDK-8146128
>> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>>
>> Test passes with timeout increased. Looks like it times out in
>> sub-tests where AESIntrinsics are disabled (testNoUseAES(),
>> testNoUseAESIntrinsic()). The easiest way to fix this test is to
>> increase timeout.
>>
>> Run parameter was added:
>> @run main/othervm/timeout=300
>>
>>
>> Here is webrev:
>> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>>
>>
>> Thanks,
>> Alexander
>>
>>
>>
>

From alexander.vorobyev at oracle.com  Thu Sep  1 18:23:57 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Thu, 1 Sep 2016 21:23:57 +0300
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <57C86FD9.2030508@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
	<57C86FD9.2030508@oracle.com>
Message-ID: <9fa4959b-b6bf-1dcc-d909-d46d62748000@oracle.com>


This test uses test/compiler/codegen/aes/TestAESMain.java. It runs it 
with different AES-related flags and then analyzes the output. But 
TestAESMain.java performs very intensive calculations, and it takes more 
time when acceleration is disabled (testNoUseAES(), 
testNoUseAESIntrinsic()). That is the cause of the timeout.

On 01.09.2016 21:13, Leonid Mesnik wrote:
> Hi
>
> The hotspot compiler changes should go to jdk9/hs-comp and not to 
> 9-dev. Also hotspot-compiler-dev at openjdk.java.net alias should be used 
> for compiler specific product and test changes.
>
> It is unclear from issue description/comment what is the  root cause 
> of failure and how it was fixed. Could you please add this information.
>
> Leonid
>
> On 01.09.2016 20:58, Alexander Vorobyev wrote:
>>
>> Hi All,
>>
>> I'd like review for JDK-8146128 
>> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>>
>> Test passes with timeout increased. Looks like it times out in 
>> sub-tests where AESIntrinsics are disabled (testNoUseAES(), 
>> testNoUseAESIntrinsic()). The easiest way to fix this test is to 
>> increase timeout.
>>
>> Run parameter was added:
>> @run main/othervm/timeout=300
>>
>>
>> Here is webrev:
>> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>>
>>
>> Thanks,
>> Alexander
>>
>>
>>
>


From alexander.vorobyev at oracle.com  Thu Sep  1 18:36:41 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Thu, 1 Sep 2016 21:36:41 +0300
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <a7688c55-ec1a-8cd0-8993-cb90213951ef@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
	<57C86FD9.2030508@oracle.com>
	<a7688c55-ec1a-8cd0-8993-cb90213951ef@oracle.com>
Message-ID: <2b3ca745-2d17-61a7-07fe-50ef619d8dde@oracle.com>

Do you mean to stop the test execution if there is not enough time 
remaining, even if not all test cases have finished?


On 01.09.2016 21:15, Vladimir Kozlov wrote:
> Yes, removing jdk90dev from to:
>
> 300 is not enough. From bug report:
>
> elapsed time (seconds): 482.214
>
> An other way to solve that is to check remaining time after each test 
> (forked VM) is executed and exit gracefully.
>
> Thanks,
> Vladimir
>
>
>
> On 9/1/16 11:13 AM, Leonid Mesnik wrote:
>> Hi
>>
>> The hotspot compiler changes should go to jdk9/hs-comp and not to 9-dev.
>> Also hotspot-compiler-dev at openjdk.java.net alias should be used for
>> compiler specific product and test changes.
>>
>> It is unclear from issue description/comment what is the  root cause of
>> failure and how it was fixed. Could you please add this information.
>>
>> Leonid
>>
>> On 01.09.2016 20:58, Alexander Vorobyev wrote:
>>>
>>> Hi All,
>>>
>>> I'd like review for JDK-8146128
>>> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>>>
>>> Test passes with timeout increased. Looks like it times out in
>>> sub-tests where AESIntrinsics are disabled (testNoUseAES(),
>>> testNoUseAESIntrinsic()). The easiest way to fix this test is to
>>> increase timeout.
>>>
>>> Run parameter was added:
>>> @run main/othervm/timeout=300
>>>
>>>
>>> Here is webrev:
>>> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>>>
>>>
>>> Thanks,
>>> Alexander
>>>
>>>
>>>
>>


From vladimir.kozlov at oracle.com  Thu Sep  1 18:38:32 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 11:38:32 -0700
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
Message-ID: <6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>

Good. But it looks like some code still relies on the old stack layout in 
stubs, for example sha256_AVX2():

#ifndef _WIN64
   _XMM_SAVE_SIZE = 0,
#else
   _XMM_SAVE_SIZE = 8*16,
#endif

Please check that all other related code is fixed too. (I looked at all 
uses of _WIN64 in src/cpu/x86/vm/.)

Thanks,
Vladimir

On 8/31/16 10:17 PM, Kharbas, Kishor wrote:
> Hello,
>
> I removed the unwanted save and restore of registers in the range XMM6-XMM31 from the x64_64 stubs.
> I also removed the #ifdef _WIN64 block from x86.ad file.
>
> Link to the new patch : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/
>
> Thanks
> Kishor
>
>
> -----Original Message-----
> From: Kharbas, Kishor
> Sent: Wednesday, August 24, 2016 6:24 PM
> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows
>
> Thanks Vladimir for quick feedback.
> I will look into the stubs which save the registers in the range XMM6-XMM31. Also the first comment makes perfect sense.
>
> Thanks
> Kishor
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, August 24, 2016 3:08 PM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows
>
> Hi Kishor,
>
> First, #ifdef _WIN64 is not needed anymore since calling convention is similat to unix now.
>
> Second, I would like you to look more broadly. With this change we don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not sure that we can remove all #ifdef _WIN64 there but for most of them I think we can do. Please, look.
>
> Thanks,
> Vladimir
>
> On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
>> Requesting the community to review the patch for
>> https://bugs.openjdk.java.net/browse/JDK-8078122
>>
>> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>>
>> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>>
>> Thank you.
>>
>> Kishor
>>

From vladimir.kozlov at oracle.com  Thu Sep  1 18:44:52 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 11:44:52 -0700
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <2b3ca745-2d17-61a7-07fe-50ef619d8dde@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
	<57C86FD9.2030508@oracle.com>
	<a7688c55-ec1a-8cd0-8993-cb90213951ef@oracle.com>
	<2b3ca745-2d17-61a7-07fe-50ef619d8dde@oracle.com>
Message-ID: <2c927fd6-17ad-137e-669d-822a1bab7c57@oracle.com>

Yes, in addition to timeout increase.

Because we can always find a very slow platform (a SPARC VM, for example) on 
which any reasonable timeout may not be enough. Such cases should be rare 
with the increased timeout, so skipping the remaining tests is fine, I 
think. You can't increase the timeout to hours.

Thanks,
Vladimir

On 9/1/16 11:36 AM, Alexander Vorobyev wrote:
> Do you mean to stop the test execution if there is not enough time
> remained? Even if not all test cases finished?
>
>
> On 01.09.2016 21:15, Vladimir Kozlov wrote:
>> Yes, removing jdk90dev from to:
>>
>> 300 is not enough. From bug report:
>>
>> elapsed time (seconds): 482.214
>>
>> An other way to solve that is to check remaining time after each test
>> (forked VM) is executed and exit gracefully.
>>
>> Thanks,
>> Vladimir
>>
>>
>>
>> On 9/1/16 11:13 AM, Leonid Mesnik wrote:
>>> Hi
>>>
>>> The hotspot compiler changes should go to jdk9/hs-comp and not to 9-dev.
>>> Also hotspot-compiler-dev at openjdk.java.net alias should be used for
>>> compiler specific product and test changes.
>>>
>>> It is unclear from issue description/comment what is the  root cause of
>>> failure and how it was fixed. Could you please add this information.
>>>
>>> Leonid
>>>
>>> On 01.09.2016 20:58, Alexander Vorobyev wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I'd like review for JDK-8146128
>>>> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>>>>
>>>> Test passes with timeout increased. Looks like it times out in
>>>> sub-tests where AESIntrinsics are disabled (testNoUseAES(),
>>>> testNoUseAESIntrinsic()). The easiest way to fix this test is to
>>>> increase timeout.
>>>>
>>>> Run parameter was added:
>>>> @run main/othervm/timeout=300
>>>>
>>>>
>>>> Here is webrev:
>>>> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>>>>
>>>>
>>>> Thanks,
>>>> Alexander
>>>>
>>>>
>>>>
>>>
>
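
The "check remaining time after each forked VM and exit gracefully" idea might be sketched as follows. This is a hedged illustration only: TimeBudgetRunner and runWithinBudget are hypothetical names, each Runnable stands in for one forked-VM sub-test, and the budget value is an assumption rather than anything read from jtreg. The runner returns the names of the skipped cases so that they can at least be logged.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch; not part of any webrev in this thread.
final class TimeBudgetRunner {
    /**
     * Runs the cases in insertion order. Before each case the clock is checked;
     * once the budget is used up, the remaining cases are skipped and their
     * names are returned so the caller can report them.
     */
    static List<String> runWithinBudget(Map<String, Runnable> cases,
                                        long budgetMillis,
                                        LongSupplier clockMillis) {
        long deadline = clockMillis.getAsLong() + budgetMillis;
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, Runnable> testCase : cases.entrySet()) {
            if (clockMillis.getAsLong() >= deadline) {
                skipped.add(testCase.getKey());
                continue;
            }
            testCase.getValue().run();
        }
        return skipped;
    }

    public static void main(String[] args) {
        Map<String, Runnable> cases = new LinkedHashMap<>();
        // Placeholders for the real sub-tests, which would each fork a VM.
        cases.put("testUseAES", () -> System.out.println("running testUseAES"));
        cases.put("testNoUseAES", () -> System.out.println("running testNoUseAES"));
        List<String> skipped =
            runWithinBudget(cases, 600_000L, System::currentTimeMillis);
        System.out.println("Skipped for lack of time: " + skipped);
    }
}
```

Note that the check happens only between cases, so a single very slow case can still overrun the budget; this sketch does not try to predict how long the next case will take.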

From vladimir.kozlov at oracle.com  Thu Sep  1 18:56:12 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Sep 2016 11:56:12 -0700
Subject: CR for RFR 8164989
In-Reply-To: <C568518E7B433348B114B6A7122D474756F0B4B8@FMSMSX102.amr.corp.intel.com>
References: <C568518E7B433348B114B6A7122D474756F0B4B8@FMSMSX102.amr.corp.intel.com>
Message-ID: <aa580195-616b-c984-8eae-3b4907077d0c@oracle.com>

Hi, Michael

Please add a comment that explains why it is disabled ( 0 && ).
File a bug (if you have not done so already) to address the 
compress issue, and reference it in the comment.

Thanks,
Vladimir

On 8/30/16 6:30 PM, Berg, Michael C wrote:
> Hi Folks,
>
> I would like to contribute a bug fix for SKX and KNL EVEX code gen.  The
> inflate and compress intrinsics on avx512 yield incorrect results and
> cause derby, sunflow, xml.transform and xml.validation to fail.  I have
> disabled the avx512 context for compress as it needs some rework and
> repaired inflate.  Please review the resultant code.
>
>
>
> This code was tested as follows: hotspot jreg, SPECjvm2008 bdw, skx, knl
> complete with no issues.  This change addresses
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/4a39ee246f70 which
> was added in early May.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8164989
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8164989/webrev/
>
>
>
> Regards,
>
> Michael
>

From alexander.vorobyev at oracle.com  Fri Sep  2 19:40:02 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Fri, 2 Sep 2016 22:40:02 +0300
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <2c927fd6-17ad-137e-669d-822a1bab7c57@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
	<57C86FD9.2030508@oracle.com>
	<a7688c55-ec1a-8cd0-8993-cb90213951ef@oracle.com>
	<2b3ca745-2d17-61a7-07fe-50ef619d8dde@oracle.com>
	<2c927fd6-17ad-137e-669d-822a1bab7c57@oracle.com>
Message-ID: <6814a4c0-272e-44a4-4dd3-e3dd30a61966@oracle.com>

Here is a new webrev: 
http://cr.openjdk.java.net/~avorobye/8146128/webrev.01/

Changes:

- timeout increased to 600;

- TestAESMain now runs with 100 iterations and 1000 iterations for 
warm-up with -XX:CompileThresholdScaling=0.01 option added.

These changes allow our test to run much faster. Also, we can still be 
sure that methods are compiled (as I understand, by default compilation 
starts after 10000 iterations for the server compiler, so the settings 
listed above are suitable for us).

About checking the remaining time: how can we predict whether the 
remaining time is still enough for the next test case? Also, the test 
cases have different durations, which makes any estimate of the time very 
vague. And if we just skip some test cases, we will never know about it 
from the test results (because the whole test will be marked as passed). 
I am not sure we should add such functionality for the really rare cases 
when there is not enough time. What do you think?


On 01.09.2016 21:44, Vladimir Kozlov wrote:
> Yes, in addition to timeout increase.
>
> Because we can always find very slow platform (SPARC VM, for example) 
> on which any reasonable timeout may be not enough. It would be rare 
> cases with increased timeout so that skipping remaining tests is fine, 
> I think. You can't increase timeout to hours.
>
> Thanks,
> Vladimir
>
> On 9/1/16 11:36 AM, Alexander Vorobyev wrote:
>> Do you mean to stop the test execution if there is not enough time
>> remained? Even if not all test cases finished?
>>
>>
>> On 01.09.2016 21:15, Vladimir Kozlov wrote:
>>> Yes, removing jdk90dev from to:
>>>
>>> 300 is not enough. From bug report:
>>>
>>> elapsed time (seconds): 482.214
>>>
>>> An other way to solve that is to check remaining time after each test
>>> (forked VM) is executed and exit gracefully.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>>
>>> On 9/1/16 11:13 AM, Leonid Mesnik wrote:
>>>> Hi
>>>>
>>>> The hotspot compiler changes should go to jdk9/hs-comp and not to 
>>>> 9-dev.
>>>> Also hotspot-compiler-dev at openjdk.java.net alias should be used for
>>>> compiler specific product and test changes.
>>>>
>>>> It is unclear from issue description/comment what is the root cause of
>>>> failure and how it was fixed. Could you please add this information.
>>>>
>>>> Leonid
>>>>
>>>> On 01.09.2016 20:58, Alexander Vorobyev wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I'd like review for JDK-8146128
>>>>> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>>>>>
>>>>> Test passes with timeout increased. Looks like it times out in
>>>>> sub-tests where AESIntrinsics are disabled (testNoUseAES(),
>>>>> testNoUseAESIntrinsic()). The easiest way to fix this test is to
>>>>> increase timeout.
>>>>>
>>>>> Run parameter was added:
>>>>> @run main/othervm/timeout=300
>>>>>
>>>>>
>>>>> Here is webrev:
>>>>> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Alexander
>>>>>
>>>>>
>>>>>
>>>>
>>


From michael.c.berg at intel.com  Fri Sep  2 20:05:13 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Fri, 2 Sep 2016 20:05:13 +0000
Subject: CR for RFR 8164989
In-Reply-To: <aa580195-616b-c984-8eae-3b4907077d0c@oracle.com>
References: <C568518E7B433348B114B6A7122D474756F0B4B8@FMSMSX102.amr.corp.intel.com>
	<aa580195-616b-c984-8eae-3b4907077d0c@oracle.com>
Message-ID: <C568518E7B433348B114B6A7122D474756F0C07F@FMSMSX102.amr.corp.intel.com>

Vladimir, please see the latest webrev for the comment addition:

http://cr.openjdk.java.net/~mcberg/8164989/webrev.02/

Also, I have created a new bug and linked it to this JBS issue (https://bugs.openjdk.java.net/browse/JDK-8164989 ).

See https://bugs.openjdk.java.net/browse/JDK-8165287 for details concerning the remaining issue for compress.

Thanks,
Michael


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Thursday, September 01, 2016 11:56 AM
To: hotspot-compiler-dev at openjdk.java.net; Berg, Michael C <michael.c.berg at intel.com>
Subject: Re: CR for RFR 8164989

Hi, Michael

Please, add comment which explain why it is disabled ( 0 && ).
File a bug (if you did not do that already) which will address the compress issue and reference it in the comment.

Thanks,
Vladimir

On 8/30/16 6:30 PM, Berg, Michael C wrote:
> Hi Folks,
>
> I would like to contribute a bug fix for SKX and KNL EVEX code gen.  
> The inflate and compress intrinsics on avx512 yield incorrect results 
> and cause derby, sunflow, xml.transform and xml.validation to fail.  I 
> have disabled the avx512 context for compress as it needs some rework 
> and repaired inflate.  Please review the resultant code.
>
>
>
> This code was tested as follows: hotspot jreg, SPECjvm2008 bdw, skx, 
> knl complete with no issues.  This change addresses
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/4a39ee246f70 which 
> was added in early May.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8164989
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8164989/webrev/
>
>
>
> Regards,
>
> Michael
>

From kishor.kharbas at intel.com  Fri Sep  2 22:07:44 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Fri, 2 Sep 2016 22:07:44 +0000
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>

Thanks Vladimir,

I have updated the patch : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.02/

I looked for other places in src/cpu/x86/vm. I feel every case is covered.

- Kishor

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Thursday, September 1, 2016 11:39 AM
To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows

Good. But looks like some code relied on old stack layout in stubs, for example sha256_AVX2():

#ifndef _WIN64
   _XMM_SAVE_SIZE = 0,
#else
   _XMM_SAVE_SIZE = 8*16,
#endif

Please, check that all other related code is fixed too. (I looked on all cases of _WIN64 in src/cpu/x86/vm/).

Thanks,
Vladimir

On 8/31/16 10:17 PM, Kharbas, Kishor wrote:
> Hello,
>
> I removed the unwanted save and restore of registers in the range XMM6-XMM31 from the x64_64 stubs.
> I also removed the #ifdef _WIN64 block from x86.ad file.
>
> Link to the new patch : 
> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/
>
> Thanks
> Kishor
>
>
> -----Original Message-----
> From: Kharbas, Kishor
> Sent: Wednesday, August 24, 2016 6:24 PM
> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; 
> hotspot-compiler-dev at openjdk.java.net
> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get 
> clobbered by a JNI call on windows
>
> Thanks Vladimir for quick feedback.
> I will look into the stubs which save the registers in the range XMM6-XMM31. Also the first comment makes perfect sense.
>
> Thanks
> Kishor
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, August 24, 2016 3:08 PM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get 
> clobbered by a JNI call on windows
>
> Hi Kishor,
>
> First, #ifdef _WIN64 is not needed anymore since calling convention is similat to unix now.
>
> Second, I would like you to look more broadly. With this change we don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not sure that we can remove all #ifdef _WIN64 there but for most of them I think we can do. Please, look.
>
> Thanks,
> Vladimir
>
> On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
>> Requesting the community to review the patch for
>> https://bugs.openjdk.java.net/browse/JDK-8078122
>>
>> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>>
>> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>>
>> Thank you.
>>
>> Kishor
>>

From smita.kamath at intel.com  Fri Sep  2 22:35:56 2016
From: smita.kamath at intel.com (Kamath, Smita)
Date: Fri, 2 Sep 2016 22:35:56 +0000
Subject: FW: RFR (M): bug-id: bug summary
In-Reply-To: <02FCFB8477C4EF43A2AD8E0C60F3DA2B75AA104B@FMSMSX112.amr.corp.intel.com>
References: <02FCFB8477C4EF43A2AD8E0C60F3DA2B75AA104B@FMSMSX112.amr.corp.intel.com>
Message-ID: <6563F381B547594081EF9DE181D07912444A5AF3@FMSMSX119.amr.corp.intel.com>

Hi All,

I would like to contribute an optimization for SHA 512 towards JDK 9.1. This optimization shows ~2X improvement on X86_64 platforms.

Bug: https://bugs.openjdk.java.net/browse/JDK-8165381
Webrev: http://cr.openjdk.java.net/~vdeshpande/8165381/webrev.00/


Hotspot jtreg tests pass on Windows and Linux with this patch.

Please review and sponsor.


Thanks,
Smita Kamath

From smita.kamath at intel.com  Fri Sep  2 22:48:28 2016
From: smita.kamath at intel.com (Kamath, Smita)
Date: Fri, 2 Sep 2016 22:48:28 +0000
Subject: RFR (M): 8165381 : Update for x86 SHA512 using AVX2
Message-ID: <6563F381B547594081EF9DE181D07912444A5B15@FMSMSX119.amr.corp.intel.com>

Hi All,

I would like to contribute an optimization for SHA 512 towards JDK 9.1. This optimization shows ~2X improvement on X86_64 platforms.

Bug: https://bugs.openjdk.java.net/browse/JDK-8165381
Webrev: http://cr.openjdk.java.net/~vdeshpande/8165381/webrev.00/


Hotspot jtreg tests pass on Windows and Linux with this patch.

Please review and sponsor.


Thanks,
Smita Kamath

From jamsheed.c.m at oracle.com  Mon Sep  5 07:53:04 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Mon, 5 Sep 2016 13:23:04 +0530
Subject: RFR: 8164508: unexpected profiling mismatch in c1 generated code
Message-ID: <6d873f34-2f65-e96f-0189-ca8af07ae824@oracle.com>

Hi,

webrev: http://cr.openjdk.java.net/~jcm/8164508/webrev.00/

bug id: https://bugs.openjdk.java.net/browse/JDK-8164508


We were skipping profiling of the first argument (recv) for virtual call
sites to a static callee. This was not done for the non-inlined case in C1
(see the linked issue for reference: https://bugs.openjdk.java.net/browse/JDK-8027631).

- bool has_receiver = x->inlined() && !x->callee()->is_static() && !Bytecodes::has_receiver(bc);
+ bool has_receiver = x->callee()->is_loaded() && !x->callee()->is_static() && !Bytecodes::has_receiver(bc);

The above change is not absolutely necessary, as this can currently happen
only when inlining at _linkToVirtual and _linkToInterface sites, and linker
elimination and callee inlining always happen together in C1.

Please review.

Best Regards,
Jamsheed



From goetz.lindenmaier at sap.com  Mon Sep  5 11:54:52 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 5 Sep 2016 11:54:52 +0000
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
Message-ID: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>

Hi,

This fixes the RTM tests with respect to supported platforms on ppc.
Please review this change. I also need a sponsor.
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.bs/
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.hs/

RTM uses special instructions that are only available on recent x86 CPUs. On x86, this feature does not need OS support. On ppc, the equivalent functionality, hardware transactional memory, requires OS support. Thus the feature is only enabled by the VM if CPU and OS are at a specific level. The tests must check this, too. This holds for AIX and Linux.

To do so, this change introduces rtm/predicate/SupportedOS.java, which checks for a suitable OS version on ppc and returns true on all other platforms.
The OS version is retrieved from Platform.java, which has new methods getOsVersionMajor() and getOsVersionMinor().
To simplify the checks in the tests, I also introduced a 3-way AndPredicate constructor.

To simplify the OS version check on AIX, I changed the enabling of RTM on AIX to require AIX 7.2.
Before, it was enabled on AIX 7.1.3.30, which contains an important bug fix.  The
last digits of this version are not exported to the os.version property, so I cannot
check for them in the test.
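
A minimal sketch of such an OS-version predicate, assuming hypothetical names
(OsVersionSketch, versionComponent, atLeast, supportedOs) that are not taken
from the webrev. It only illustrates comparing the major and minor parts of an
os.version-style string against AIX 7.2; the Linux threshold is not stated in
this thread and is left out.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; this is not the SupportedOS code from the webrev.
final class OsVersionSketch {
    // Returns the index-th dot-separated component of an os.version-style
    // string, or -1 if that component is missing or not a plain number.
    static int versionComponent(String osVersion, int index) {
        String[] parts = osVersion.split("\\.");
        if (index >= parts.length) {
            return -1;
        }
        try {
            return Integer.parseInt(parts[index]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // True if the major.minor prefix of osVersion is at least reqMajor.reqMinor.
    static boolean atLeast(String osVersion, int reqMajor, int reqMinor) {
        int major = versionComponent(osVersion, 0);
        int minor = versionComponent(osVersion, 1);
        return major > reqMajor || (major == reqMajor && minor >= reqMinor);
    }

    // Predicate in the spirit of SupportedOS: only AIX is version-checked in
    // this sketch; every other OS name is accepted.
    static BooleanSupplier supportedOs(String osName, String osVersion) {
        return () -> !osName.equals("AIX") || atLeast(osVersion, 7, 2);
    }

    public static void main(String[] args) {
        BooleanSupplier predicate = supportedOs(
                System.getProperty("os.name"), System.getProperty("os.version"));
        System.out.println("RTM tests applicable: " + predicate.getAsBoolean());
    }
}
```

Passing the OS name and version in as parameters, rather than reading system
properties inside the predicate, keeps the check easy to exercise on any platform.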

Best regards,
  Goetz.

From volker.simonis at gmail.com  Mon Sep  5 12:56:41 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 5 Sep 2016 14:56:41 +0200
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
Message-ID: <CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>

Hi Goetz,

I think you only forgot to import
compiler.testlibrary.rtm.predicate.SupportedOS into
test/compiler/rtm/cli/TestUseRTMLockingOptionWithBiasedLocking.java

Also, in SupportedOS.java the line:

public boolean getAsBoolean()

is indented too far (it should be indented four spaces less, like the
annotation on the line before).

Besides that, the change looks good.

Thanks for fixing this,
Volker


From doug.simon at oracle.com  Mon Sep  5 16:45:56 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 5 Sep 2016 18:45:56 +0200
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
Message-ID: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>

JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.

http://cr.openjdk.java.net/~dnsimon/8165434/
https://bugs.openjdk.java.net/browse/JDK-8165434
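
For illustration, the reflective pattern in question looks roughly like the
sketch below. Holder and its slot field are hypothetical stand-ins, not JDK or
JVMCI classes; the sketch only shows how setAccessible is used to read a
private field, and says nothing about what the webrev replaces it with.

```java
import java.lang.reflect.Field;

// Hypothetical sketch of reading a private field via setAccessible.
final class SetAccessibleSketch {
    // Stand-in for a class whose private state is of interest.
    static final class Holder {
        private final int slot;

        Holder(int slot) {
            this.slot = slot;
        }
    }

    // Reads Holder.slot reflectively; checked reflection failures are
    // rethrown as unchecked exceptions to keep callers simple.
    static int readPrivateSlot(Holder holder) {
        try {
            Field field = Holder.class.getDeclaredField("slot");
            field.setAccessible(true); // suppress Java language access checks
            return field.getInt(holder);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readPrivateSlot(new Holder(42))); // prints 42
    }
}
```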

-Doug

From doug.simon at oracle.com  Mon Sep  5 16:49:16 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 5 Sep 2016 18:49:16 +0200
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
Message-ID: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>

In jvmci-8, we increased the interpreter code size when JVMCI code is included: 

http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 

This needs to also be done in jdk9.

https://bugs.openjdk.java.net/browse/JDK-8165457
http://cr.openjdk.java.net/~dnsimon/8165457/

-Doug

From goetz.lindenmaier at sap.com  Tue Sep  6 09:11:49 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 6 Sep 2016 09:11:49 +0000
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
	<CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
Message-ID: <a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>

Hi Volker, 

thanks for the review! I fixed the two issues:
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/02/webrev.hs/
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/02/webrev.bs/

Best regards,
  Goetz.



From volker.simonis at gmail.com  Tue Sep  6 10:21:03 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 6 Sep 2016 12:21:03 +0200
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
	<CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
	<a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>
Message-ID: <CA+3eh13rLhfyHP_LNReFk4HCB4twxA51HU+MEo8JWD_y86KqRQ@mail.gmail.com>

Thumbs up from me!

Volker


From filipp.zhinkin at gmail.com  Tue Sep  6 11:46:03 2016
From: filipp.zhinkin at gmail.com (Filipp Zhinkin)
Date: Tue, 6 Sep 2016 14:46:03 +0300
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <CA+3eh13rLhfyHP_LNReFk4HCB4twxA51HU+MEo8JWD_y86KqRQ@mail.gmail.com>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
	<CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
	<a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>
	<CA+3eh13rLhfyHP_LNReFk4HCB4twxA51HU+MEo8JWD_y86KqRQ@mail.gmail.com>
Message-ID: <CANQc0ndWJC9uWOz0Ytbs=Cf5APDijuSyyk8N7337oGCkk4TvuA@mail.gmail.com>

Hi,

I would suggest using something like Boolean.TRUE::booleanValue
instead of null in the AndPredicate constructor, and using camel case for
Platform's fields and methods.
Otherwise the change looks good.

Just for the record: all those predicates were introduced because
there was no way to check OS/CPU/whatever using jtreg.
Now it should be possible to skip tests using jtreg's @requires tag. So
maybe we can get rid of some Java code? :)
// Not suggesting to do it right now.
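
A sketch of the first suggestion, assuming a hypothetical AndPredicateSketch
class rather than the actual test-library AndPredicate: the two-argument
constructor delegates to the three-argument one with an always-true supplier,
so no null check is needed when the predicate is evaluated.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the test library's AndPredicate.
final class AndPredicateSketch implements BooleanSupplier {
    private final BooleanSupplier a;
    private final BooleanSupplier b;
    private final BooleanSupplier c;

    // Two-way form: the third operand defaults to "always true" instead of null.
    AndPredicateSketch(BooleanSupplier a, BooleanSupplier b) {
        this(a, b, Boolean.TRUE::booleanValue);
    }

    // Three-way form.
    AndPredicateSketch(BooleanSupplier a, BooleanSupplier b, BooleanSupplier c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    @Override
    public boolean getAsBoolean() {
        // Short-circuits left to right, like a plain && chain.
        return a.getAsBoolean() && b.getAsBoolean() && c.getAsBoolean();
    }

    public static void main(String[] args) {
        BooleanSupplier yes = () -> true;
        BooleanSupplier no = () -> false;
        System.out.println(new AndPredicateSketch(yes, yes).getAsBoolean());     // true
        System.out.println(new AndPredicateSketch(yes, yes, no).getAsBoolean()); // false
    }
}
```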

Regards,
Filipp.


From goetz.lindenmaier at sap.com  Tue Sep  6 13:12:24 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Tue, 6 Sep 2016 13:12:24 +0000
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <CANQc0ndWJC9uWOz0Ytbs=Cf5APDijuSyyk8N7337oGCkk4TvuA@mail.gmail.com>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
	<CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
	<a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>
	<CA+3eh13rLhfyHP_LNReFk4HCB4twxA51HU+MEo8JWD_y86KqRQ@mail.gmail.com>
	<CANQc0ndWJC9uWOz0Ytbs=Cf5APDijuSyyk8N7337oGCkk4TvuA@mail.gmail.com>
Message-ID: <28e894e35a3a431aa92d05b310b48970@DEWDFE13DE50.global.corp.sap>

Hi Filipp,

thanks for reviewing my change! 
I fixed the two issues:
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.bs/
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.hs/

The hotspot change is unchanged except for the reviewer attribution.

I also fixed the comment in Platform.java: major->minor.

Would you mind sponsoring the change?

Best regards,
  Goetz.



From HORII at jp.ibm.com  Tue Sep  6 14:50:13 2016
From: HORII at jp.ibm.com (Hiroshi H Horii)
Date: Tue, 6 Sep 2016 23:50:13 +0900
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
Message-ID: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>

Dear Vladimir and all:

Can I please request reviews for the following change?

JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
webrev: http://cr.openjdk.java.net/~gromero/8164920/01/

As Volker commented in the above JIRA issue, this is a ppc64-only improvement
which will not
affect any of the Oracle platforms in any way.

This change includes a new implementation of the CRC32 intrinsic for ppc64le.
In my local experiment, the CRC32 of 64 KB was calculated more than 20 times
faster than with the original implementation.
The performance of the CRC32 intrinsic is important for running recent
Apache Cassandra.
A Cassandra daemon needs to read 64 KB of data from disk with a CRC32 checksum
by default.
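
A rough sketch of the kind of measurement described, using
java.util.zip.CRC32 on a 64 KB buffer. The class name, buffer contents, and
iteration count are assumptions for illustration; this is not the benchmark
behind the numbers above, and a serious measurement would use a harness such
as JMH.

```java
import java.util.zip.CRC32;

// Hypothetical micro-measurement of CRC32 over a 64 KB buffer.
final class Crc32Sketch {
    // Computes the CRC32 checksum of the whole array.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[64 * 1024];
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] = (byte) i; // arbitrary deterministic contents
        }
        int iterations = 10_000;
        long sink = 0;
        // Warm-up so the JIT has a chance to compile (and intrinsify) the update.
        for (int i = 0; i < iterations; i++) {
            sink ^= crc32(buffer);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink ^= crc32(buffer);
        }
        long elapsedNanos = System.nanoTime() - start;
        System.out.printf("avg %.1f us per 64 KB (sink=%x)%n",
                elapsedNanos / 1000.0 / iterations, sink);
    }
}
```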

This JIRA entry has the "jdk9-fc-request" label.
If there is a chance to include a new change in JDK 9 for ppc64le, I would
like to request
a review for this change.

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo



From vladimir.kozlov at oracle.com  Tue Sep  6 16:08:45 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 6 Sep 2016 09:08:45 -0700
Subject: RFR: 8164508: unexpected profiling mismatch in c1 generated code
In-Reply-To: <6d873f34-2f65-e96f-0189-ca8af07ae824@oracle.com>
References: <6d873f34-2f65-e96f-0189-ca8af07ae824@oracle.com>
Message-ID: <3ace86b5-ff91-7d2c-9b74-e4b9c497365c@oracle.com>

Good.

thanks,
Vladimir


From vladimir.kozlov at oracle.com  Tue Sep  6 16:10:20 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 6 Sep 2016 09:10:20 -0700
Subject: CR for RFR 8164989
In-Reply-To: <C568518E7B433348B114B6A7122D474756F0C07F@FMSMSX102.amr.corp.intel.com>
References: <C568518E7B433348B114B6A7122D474756F0B4B8@FMSMSX102.amr.corp.intel.com>
	<aa580195-616b-c984-8eae-3b4907077d0c@oracle.com>
	<C568518E7B433348B114B6A7122D474756F0C07F@FMSMSX102.amr.corp.intel.com>
Message-ID: <8c9d34cf-7e2c-11b7-4ab4-be8070315fad@oracle.com>

Looks good. I will sponsor it.

Thanks,
Vladimir

On 9/2/16 1:05 PM, Berg, Michael C wrote:
> Vladimir, please see the latest webrev for the comment addition:
>
> http://cr.openjdk.java.net/~mcberg/8164989/webrev.02/
>
> Also I have created a new bug and linked it to this JBS issue (https://bugs.openjdk.java.net/browse/JDK-8164989 ).
>
> See https://bugs.openjdk.java.net/browse/JDK-8165287 for details concerning the remaining issue for compress.
>
> Thanks,
> Michael
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Thursday, September 01, 2016 11:56 AM
> To: hotspot-compiler-dev at openjdk.java.net; Berg, Michael C <michael.c.berg at intel.com>
> Subject: Re: CR for RFR 8164989
>
> Hi, Michael
>
> Please, add comment which explain why it is disabled ( 0 && ).
> File a bug (if you did not do that already) which will address the compress issue and reference it in the comment.
>
> Thanks,
> Vladimir
>
> On 8/30/16 6:30 PM, Berg, Michael C wrote:
>> Hi Folks,
>>
>> I would like to contribute a bug fix for SKX and KNL EVEX code gen.
>> The inflate and compress intrinsics on avx512 yield incorrect results
>> and cause derby, sunflow, xml.transform and xml.validation to fail.  I
>> have disabled the avx512 context for compress as it needs some rework
>> and repaired inflate.  Please review the resultant code.
>>
>>
>>
>> This code was tested as follows: hotspot jtreg, SPECjvm2008 bdw, skx,
>> knl complete with no issues.  This change addresses
>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/rev/4a39ee246f70 which
>> was added in early May.
>>
>>
>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8164989
>>
>>
>> webrev:
>>
>> http://cr.openjdk.java.net/~mcberg/8164989/webrev/
>>
>>
>>
>> Regards,
>>
>> Michael
>>

From vladimir.kozlov at oracle.com  Tue Sep  6 16:49:06 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 6 Sep 2016 09:49:06 -0700
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
Message-ID: <77b35cd8-0d95-d661-cc91-324112cdd62e@oracle.com>

Good.

Thanks,
Vladimir


From vladimir.kozlov at oracle.com  Tue Sep  6 16:51:55 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 6 Sep 2016 09:51:55 -0700
Subject: [9-dev] Request for review: JDK-8146128:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig timeouts
In-Reply-To: <6814a4c0-272e-44a4-4dd3-e3dd30a61966@oracle.com>
References: <542E8041.1010101@oracle.com>
	<0be045b2-ec1f-cf9b-bcf8-86ca602eadec@oracle.com>
	<57C86FD9.2030508@oracle.com>
	<a7688c55-ec1a-8cd0-8993-cb90213951ef@oracle.com>
	<2b3ca745-2d17-61a7-07fe-50ef619d8dde@oracle.com>
	<2c927fd6-17ad-137e-669d-822a1bab7c57@oracle.com>
	<6814a4c0-272e-44a4-4dd3-e3dd30a61966@oracle.com>
Message-ID: <5b26f27d-ade9-1dab-692b-46f9c996d9b2@oracle.com>

Yes, this looks reasonable.

We may not need to check time between tests since you significantly 
reduced number of iterations.

I think we can go with these changes.

Thanks,
Vladimir

On 9/2/16 12:40 PM, Alexander Vorobyev wrote:
> Here is a new webrev:
> http://cr.openjdk.java.net/~avorobye/8146128/webrev.01/
>
> Changes:
>
> - timeout increased to 600;
>
> - TestAESMain now runs with 100 iterations and 1000 iterations for
> warm-up with -XX:CompileThresholdScaling=0.01 option added.
>
> Those changes allow our test to run much faster. Also, we still can be
> sure that methods are compiled (as I understand, by default compilation
> starts after 10000 iterations for server compiler, so settings listed
> above are suitable for us).
>
> About checking the remaining time - how can we predict whether the remaining
> time is still enough for the next test case? Also, those test cases have
> different durations, which makes our estimates about time very
> vague. And if we just skip some test cases, we never learn about it from the
> test results (because the whole test will be marked as passed).  I am not
> sure we should add such functionality for the really rare cases when
> there is not enough time. What do you think?
>
>
> On 01.09.2016 21:44, Vladimir Kozlov wrote:
>> Yes, in addition to timeout increase.
>>
>> Because we can always find very slow platform (SPARC VM, for example)
>> on which any reasonable timeout may be not enough. It would be rare
>> cases with increased timeout so that skipping remaining tests is fine,
>> I think. You can't increase timeout to hours.
>>
>> Thanks,
>> Vladimir
>>
>> On 9/1/16 11:36 AM, Alexander Vorobyev wrote:
>>> Do you mean to stop the test execution if there is not enough time
>>> remaining? Even if not all test cases have finished?
>>>
>>>
>>> On 01.09.2016 21:15, Vladimir Kozlov wrote:
>>>> Yes, removing jdk90dev from to:
>>>>
>>>> 300 is not enough. From bug report:
>>>>
>>>> elapsed time (seconds): 482.214
>>>>
>>>> Another way to solve that is to check the remaining time after each test
>>>> (forked VM) is executed and exit gracefully.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>
>>>>
>>>> On 9/1/16 11:13 AM, Leonid Mesnik wrote:
>>>>> Hi
>>>>>
>>>>> The hotspot compiler changes should go to jdk9/hs-comp and not to
>>>>> 9-dev.
>>>>> Also hotspot-compiler-dev at openjdk.java.net alias should be used for
>>>>> compiler specific product and test changes.
>>>>>
>>>>> It is unclear from issue description/comment what is the root cause of
>>>>> failure and how it was fixed. Could you please add this information.
>>>>>
>>>>> Leonid
>>>>>
>>>>> On 01.09.2016 20:58, Alexander Vorobyev wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I'd like review for JDK-8146128
>>>>>> (https://bugs.openjdk.java.net/browse/JDK-8146128)
>>>>>>
>>>>>> Test passes with timeout increased. Looks like it times out in
>>>>>> sub-tests where AESIntrinsics are disabled (testNoUseAES(),
>>>>>> testNoUseAESIntrinsic()). The easiest way to fix this test is to
>>>>>> increase timeout.
>>>>>>
>>>>>> Run parameter was added:
>>>>>> @run main/othervm/timeout=300
>>>>>>
>>>>>>
>>>>>> Here is webrev:
>>>>>> http://cr.openjdk.java.net/~avorobye/8146128/webrev.00/
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Alexander
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>

From vladimir.kozlov at oracle.com  Tue Sep  6 17:12:13 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 6 Sep 2016 10:12:13 -0700
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
Message-ID: <4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>

Good. I am starting to test these changes. I will push them if testing passes.

Thanks,
Vladimir

On 9/2/16 3:07 PM, Kharbas, Kishor wrote:
> Thanks Vladimir,
>
> I have updated the patch : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.02/
>
> I looked for other places in src/cpu/x86/vm. I feel every case is covered.
>
> - Kishor
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Thursday, September 1, 2016 11:39 AM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows
>
> Good. But it looks like some code relied on the old stack layout in stubs, for example sha256_AVX2():
>
> #ifndef _WIN64
>    _XMM_SAVE_SIZE = 0,
> #else
>    _XMM_SAVE_SIZE = 8*16,
> #endif
>
> Please check that all other related code is fixed too. (I looked at all cases of _WIN64 in src/cpu/x86/vm/).
>
> Thanks,
> Vladimir
>
> On 8/31/16 10:17 PM, Kharbas, Kishor wrote:
>> Hello,
>>
>> I removed the unwanted save and restore of registers in the range XMM6-XMM31 from the x86_64 stubs.
>> I also removed the #ifdef _WIN64 block from x86.ad file.
>>
>> Link to the new patch :
>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/
>>
>> Thanks
>> Kishor
>>
>>
>> -----Original Message-----
>> From: Kharbas, Kishor
>> Sent: Wednesday, August 24, 2016 6:24 PM
>> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>;
>> hotspot-compiler-dev at openjdk.java.net
>> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
>> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get
>> clobbered by a JNI call on windows
>>
>> Thanks Vladimir for quick feedback.
>> I will look into the stubs which save the registers in the range XMM6-XMM31. Also the first comment makes perfect sense.
>>
>> Thanks
>> Kishor
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Wednesday, August 24, 2016 3:08 PM
>> To: Kharbas, Kishor <kishor.kharbas at intel.com>;
>> hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get
>> clobbered by a JNI call on windows
>>
>> Hi Kishor,
>>
>> First, #ifdef _WIN64 is not needed anymore since the calling convention is similar to Unix now.
>>
>> Second, I would like you to look more broadly. With this change we don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not sure that we can remove all #ifdef _WIN64 there, but I think we can for most of them. Please take a look.
>>
>> Thanks,
>> Vladimir
>>
>> On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
>>> Requesting the community to review the patch for
>>> https://bugs.openjdk.java.net/browse/JDK-8078122
>>>
>>> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>>>
>>> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>>>
>>> Thank you.
>>>
>>> Kishor
>>>

From kishor.kharbas at intel.com  Tue Sep  6 18:08:57 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Tue, 6 Sep 2016 18:08:57 +0000
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
	<4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F38EB953C5@ORSMSX116.amr.corp.intel.com>

Thanks Vladimir!


From cthalinger at twitter.com  Tue Sep  6 18:12:21 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Tue, 6 Sep 2016 08:12:21 -1000
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
In-Reply-To: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
References: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
Message-ID: <864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>


> On Sep 5, 2016, at 6:45 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.
> 
> http://cr.openjdk.java.net/~dnsimon/8165434/

src/jdk.vm.ci/share/classes/jdk.vm.ci.meta/src/jdk/vm/ci/meta/ModifiersProvider.java

+    int BRIDGE = 0x0040;
+    int VARARGS = 0x0080;
+    int SYNTHETIC = 0x1000;
+    int ANNOTATION = 0x2000;
+    int ENUM = 0x4000;
I wish we could avoid that.  We can't use this stuff because it's HotSpot-dependent, right?
+        assert ModifiersProvider.SYNTHETIC == getConstant("JVM_ACC_SYNTHETIC", Integer.class);
+        assert ModifiersProvider.ANNOTATION == getConstant("JVM_ACC_ANNOTATION", Integer.class);
+        assert ModifiersProvider.BRIDGE == getConstant("JVM_ACC_BRIDGE", Integer.class);
+        assert ModifiersProvider.VARARGS == getConstant("JVM_ACC_VARARGS", Integer.class);
+        assert ModifiersProvider.ENUM == getConstant("JVM_ACC_ENUM", Integer.class);
What if we convert these constants to interface methods and the VM-dependent part has to implement them?  Or maybe even keep the fields and assign them via interface methods.
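
For illustration, a rough sketch of the interface-method variant could look like this. ModifierFlags and HotSpotLikeFlags are made-up names, not existing JVMCI types; only the bit values are taken from the patch above:

```java
// Hypothetical sketch, not the JVMCI API: the VM-independent interface
// declares accessors, and the VM-specific class supplies the bit values.
interface ModifierFlags {
    int bridge();
    int varargs();
    int synthetic();
    int annotation();
    int enumFlag();

    // Helper built only on the accessors, so it stays VM-independent.
    default boolean isSynthetic(int modifiers) {
        return (modifiers & synthetic()) != 0;
    }
}

// Assumed VM-specific implementation; the values mirror the constants in the patch.
final class HotSpotLikeFlags implements ModifierFlags {
    public int bridge()     { return 0x0040; }
    public int varargs()    { return 0x0080; }
    public int synthetic()  { return 0x1000; }
    public int annotation() { return 0x2000; }
    public int enumFlag()   { return 0x4000; }
}
```

With this shape the VM-independent code never hard-codes a flag value; a different VM would only need to provide its own implementation of the interface.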

src/share/vm/jvmci/vmStructs_jvmci.cpp

   declare_constant(JVM_ACC_FIELD_HAS_GENERIC_SIGNATURE)                   \
+  declare_preprocessor_constant("JVM_ACC_VARARGS", JVM_ACC_VARARGS)       \
+  declare_preprocessor_constant("JVM_ACC_BRIDGE", JVM_ACC_BRIDGE)   \
+  declare_preprocessor_constant("JVM_ACC_ANNOTATION", JVM_ACC_ANNOTATION)   \
+  declare_preprocessor_constant("JVM_ACC_ENUM", JVM_ACC_ENUM)   \
   declare_preprocessor_constant("JVM_ACC_SYNTHETIC", JVM_ACC_SYNTHETIC)   \
Please align the '\'.

Otherwise this looks good and generally a good cleanup.

> https://bugs.openjdk.java.net/browse/JDK-8165434
> 
> -Doug


From cthalinger at twitter.com  Tue Sep  6 18:14:17 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Tue, 6 Sep 2016 08:14:17 -1000
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
Message-ID: <A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>


> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> In jvmci-8, we increased the interpreter code size when JVMCI code is included: 
> 
> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 

What about SPARC?  Have we ever seen a problem there?  Or AArch64 for that matter?

> 
> This needs to also be done in jdk9.
> 
> https://bugs.openjdk.java.net/browse/JDK-8165457
> http://cr.openjdk.java.net/~dnsimon/8165457/
> 
> -Doug


From vladimir.kozlov at oracle.com  Tue Sep  6 21:31:16 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 6 Sep 2016 14:31:16 -0700
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
	<4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
Message-ID: <7de8489c-943b-5ecf-48c1-0bffad101070@oracle.com>

The following jtreg test failed on 32-bit Linux:

hotspot/test/compiler/runtime/Test7196199.java

----------System.err:(57/2416)----------
test_incrc: [41] = 8.081506E20 != 150000.0
test_incrc: [42] = 1.8632992E31 != 150000.0
test_incrc: [43] = 2.8397877E29 != 150000.0
...

https://bugs.openjdk.java.net/browse/JDK-7196199

was about the upper bits (64-255) of XMM (YMM) registers not being
saved/restored in the interrupt handler code during a safepoint.

It looks like your changes are not enough.

Vladimir



From doug.simon at oracle.com  Tue Sep  6 21:37:52 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Tue, 6 Sep 2016 23:37:52 +0200
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
	<A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>
Message-ID: <7ED300F2-253B-4550-BF5E-878A99EDAEB2@oracle.com>


> On 06 Sep 2016, at 20:14, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
> 
>> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> In jvmci-8, we increased the interpreter code size when JVMCI code is included: 
>> 
>> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 
> 
> What about SPARC?  Have we ever seen a problem there?  Or AArch64 for that matter?

I've only ever seen problems on AMD64. I've never seen it on SPARC and have never run on AArch64.

The real fix is that the interpreter generator should never have to guess the size of the code buffer it needs but should resize things as needed after generating the interpreter.

-Doug

From cthalinger at twitter.com  Tue Sep  6 21:39:17 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Tue, 6 Sep 2016 11:39:17 -1000
Subject: JVMCI compiler thread idle state is RUNNABLE
Message-ID: <959DA194-390F-49B2-97FA-CE402CA9D03D@twitter.com>

One thing we noticed here at Twitter is that JVMCI threads are not hidden (is_hidden_from_external_view) but at the same time they show up as always active.

I don't know the history here but I'm speculating that since compiler threads were always hidden no-one bothered.  In the SIGQUIT thread dump compiler threads show up as RUNNABLE:

"C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fdcc2016800 nid=0x5103 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fdcc2821800 nid=0x4f03 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

The specification of RUNNABLE is:

        /**
         * Thread state for a runnable thread.  A thread in the runnable
         * state is executing in the Java virtual machine but it may
         * be waiting for other resources from the operating system
         * such as processor.
         */
        RUNNABLE,

and that makes sense.  But this is very confusing to the user (as one of our internal users reported to me).  Maybe JVMCI threads should just be hidden, too?
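
For reference, the same states can also be inspected from Java code. This is only a sketch using the standard Thread.getAllStackTraces() API; the class name ThreadStates is made up, and as far as I know threads that are hidden from external view would not appear in the result:

```java
import java.util.Map;
import java.util.TreeMap;

final class ThreadStates {
    // Collects name -> state for every live thread visible to Java code.
    static Map<String, Thread.State> snapshot() {
        Map<String, Thread.State> states = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            states.put(t.getName(), t.getState());
        }
        return states;
    }

    public static void main(String[] args) {
        // Print one line per visible thread, sorted by name.
        snapshot().forEach((name, state) -> System.out.println(name + ": " + state));
    }
}
```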

From cthalinger at twitter.com  Tue Sep  6 21:58:08 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Tue, 6 Sep 2016 11:58:08 -1000
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <7ED300F2-253B-4550-BF5E-878A99EDAEB2@oracle.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
	<A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>
	<7ED300F2-253B-4550-BF5E-878A99EDAEB2@oracle.com>
Message-ID: <92B9E4F8-DF56-475B-A9EC-6FB179C58925@twitter.com>


> On Sep 6, 2016, at 11:37 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 06 Sep 2016, at 20:14, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>> 
>>> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> In jvmci-8, we increased the interpreter code size when JVMCI code is included: 
>>> 
>>> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 
>> 
>> What about SPARC?  Have we ever seen a problem there?  Or AArch64 for that matter?
> 
> I've only ever seen problems on AMD64. I've never seen it on SPARC and have never run on AArch64.
> 
> The real fix is that the interpreter generator should never have to guess the size of the code buffer it needs but should resize things as needed after generating the interpreter.

Yes, it should.

From kishor.kharbas at intel.com  Wed Sep  7 00:39:31 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Wed, 7 Sep 2016 00:39:31 +0000
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <7de8489c-943b-5ecf-48c1-0bffad101070@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
	<4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
	<7de8489c-943b-5ecf-48c1-0bffad101070@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F38EB957C3@ORSMSX116.amr.corp.intel.com>

Hi Vladimir,

The patch only touches code under _WIN64. I am having a hard time understanding why the test fails on 32-bit Linux.

Btw, the test passes on 64-bit Windows. I am planning to test on Linux too.

Thanks
Kishor


From goetz.lindenmaier at sap.com  Wed Sep  7 10:07:57 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 7 Sep 2016 10:07:57 +0000
Subject: [9] RFR (M): 8160543: C1: Crash in java.lang.String.indexOf in
	some	java.sql tests
In-Reply-To: <57C6EA3F.1060702@oracle.com>
References: <57C6EA3F.1060702@oracle.com>
Message-ID: <a86f896cf0a9417fbfae48dbc2c041a4@DEWDFE13DE50.global.corp.sap>

Hi Nils, 

I encountered this issue in our nightly JCK test runs with -Xcomp.
I applied your fix to the VM under test, and I can no longer observe
the error.

Also, I had a look at the code. It looks good.
The if around assert(klass->is_loaded(), "sanity"); could be merged into the assertion.
Also, if this holds, a series of calls to klass->is_loaded() can be removed.

Please fix the indentation in c1_GraphBuilder.cpp 2056ff.

Thanks for fixing this,
  Goetz.

> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
> bounces at openjdk.java.net] On Behalf Of Nils Eliasson
> Sent: Mittwoch, 31. August 2016 16:31
> To: hotspot-compiler-dev at openjdk.java.net compiler <hotspot-compiler-
> dev at openjdk.java.net>
> Subject: [9] RFR (M): 8160543: C1: Crash in java.lang.String.indexOf in some
> java.sql tests
> 
> Hi,
> 
> These are fixes for bugs [1] JDK-8160543 "C1: Crash in java.lang.String.indexOf in
> some java.sql tests" and [2] JDK-8154172 "NPE is thrown instead of linkage
> error when invoking nonexistent method"
> 
> * Description:
> 
> Problem in bug #2: A method that is not loaded must not have a null check at
> the call. The unloaded method may not exist, and then we may throw an NPE
> on a null receiver before LinkageError and violate the VM spec.
> 
> Problem in bug #1: A final method that is not loaded at compile time (the
> final-property is unknown), but is actually loaded from another classloader
> (and may already be compiled) must null check its receiver. The null check
> cannot be at the call site since it would violate #2. Instead the call will have to
> use the target method's unverified entry point.
> 
> An additional problem I encountered is that profiling requires a null check,
> but if the method isn't loaded we can't add one. So we can't profile unloaded
> methods.
> 
> The solution to these problems shouldn't introduce any regression in the
> normal use case. Unloaded methods are only common in the compiler when
> using -Xcomp, when the interpreter hasn't made sure everything is loaded. I
> have made the trade-off that it is acceptable to have a performance
> regression in the -Xcomp case in order to meet the VM specification.
> 
> * Summary of changes:
> 
> hotspot/src/share/vm/code/compiledIC.cpp
> 
> -    if (static_bound || is_optimized) {
> +    if (is_optimized) {
> 
> static_bound is true if the method at resolve time is declared final. This is
> uninteresting; we need to know if the call was known final at compile time.
> is_optimized, however, is only true if the target was loaded and was final at
> compile time. This change makes sure that we get a call to the unverified
> entry point if there was no null check emitted.
> 
> hotspot/src/share/vm/c1/c1_GraphBuilder.cpp
> Contains both changes and some simplifications. The is_resolved method
> has been exploded, and a redundant check was removed. The major change is
> where we decide if a null check should be emitted and when profiling can be
> added.
> 
> * Testing
> 
> These are some useful tests I have run with and without -Xcomp and with and
> without -XX:TieredStopAtLevel=1:
> - jdk/test/java/sql/testng/test/sql/CallableStatementTests.java (for bug #1)
> - JCK/BINC (binc02908m01 for bug #2 and all binc0500)
> - hotspot/test/compiler/linkage/LinkageErrors.java
> 
> I will await complete runs of hs-comp tier 0 - 5 before checkin.
> 
> Please review,
> 
> Regards,
> Nils Eliasson
> 
> 
> 
> Webrev: http://cr.openjdk.java.net/~neliasso/8160543/webrev.09/
> Bug [1]: https://bugs.openjdk.java.net/browse/JDK-8160543
> Bug [2]: https://bugs.openjdk.java.net/browse/JDK-8160383


From doug.simon at oracle.com  Wed Sep  7 10:34:08 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 7 Sep 2016 12:34:08 +0200
Subject: JVMCI compiler thread idle state is RUNNABLE
In-Reply-To: <959DA194-390F-49B2-97FA-CE402CA9D03D@twitter.com>
References: <959DA194-390F-49B2-97FA-CE402CA9D03D@twitter.com>
Message-ID: <7C8D02B7-0069-4D1C-BF47-D435F443EBD9@oracle.com>


> On 06 Sep 2016, at 23:39, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
> One thing we noticed here at Twitter is that JVMCI threads are not hidden (is_hidden_from_external_view) but at the same time they show up as always active.
> 
> I don't know the history here but I'm speculating that since compiler threads were always hidden no-one bothered.  In the SIGQUIT thread dump compiler threads show up as RUNNABLE:
> 
> "C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fdcc2016800 nid=0x5103 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> 
> "C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fdcc2821800 nid=0x4f03 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> 
> The specification of RUNNABLE is:
> 
>         /**
>          * Thread state for a runnable thread.  A thread in the runnable
>          * state is executing in the Java virtual machine but it may
>          * be waiting for other resources from the operating system
>          * such as processor.
>          */
>         RUNNABLE,
> 
> and that makes sense.  But this is very confusing to the user (as one of our internal users reported to me).  Maybe JVMCI threads should just be hidden, too?

Why is this very confusing? All sorts of non-app threads show up in a SIGQUIT thread dump, don't they?

Not sure if this is covered by your proposal/question but I don't think JVMCI compiler threads should be hidden from JVMTI, otherwise they could never be debugged by a Java IDE.

-Doug

From doug.simon at oracle.com  Wed Sep  7 12:29:20 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 7 Sep 2016 14:29:20 +0200
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
In-Reply-To: <864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>
References: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
	<864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>
Message-ID: <6224CDA0-63E6-442C-BD13-732208FA75A2@oracle.com>


> On 06 Sep 2016, at 20:12, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
> 
>> On Sep 5, 2016, at 6:45 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.
>> 
>> http://cr.openjdk.java.net/~dnsimon/8165434/
> 
> src/jdk.vm.ci/share/classes/jdk.vm.ci.meta/src/jdk/vm/ci/meta/ModifiersProvider.java
> 
> +    int BRIDGE = 0x0040;
> +    int VARARGS = 0x0080;
> +    int SYNTHETIC = 0x1000;
> +    int ANNOTATION = 0x2000;
> +    int ENUM = 0x4000;
> I wish we could avoid that.  We can't use this stuff because it's HotSpot-dependent, right?
> +        assert ModifiersProvider.SYNTHETIC == getConstant("JVM_ACC_SYNTHETIC", Integer.class);
> +        assert ModifiersProvider.ANNOTATION == getConstant("JVM_ACC_ANNOTATION", Integer.class);
> +        assert ModifiersProvider.BRIDGE == getConstant("JVM_ACC_BRIDGE", Integer.class);
> +        assert ModifiersProvider.VARARGS == getConstant("JVM_ACC_VARARGS", Integer.class);
> +        assert ModifiersProvider.ENUM == getConstant("JVM_ACC_ENUM", Integer.class);
> What if we convert these constants to interface methods and the VM-dependent part has to implement them?  Or maybe even keep the fields and assign them via interface methods.

Following your suggestion, I've factored out these VM-dependent flags to a new HotSpotModifiers class:

http://cr.openjdk.java.net/~dnsimon/8165434.v2/
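
A minimal sketch of the "constants behind interface methods" idea discussed above. Every name here (ModifierDemo, VmModifiers, ClassFileModifiers, findMethod) is hypothetical and is not taken from the webrev; only the two flag values come from the quoted diff. It assumes the flags returned by java.lang.reflect carry the same bits, which is how Method.isVarArgs() and Class.isEnum() are specified to behave.

```java
import java.lang.reflect.Method;

/** Illustrative only: every name in this file is hypothetical, not from the webrev. */
final class ModifierDemo {

    /** VM-independent view: the VM-specific side supplies the flag values. */
    interface VmModifiers {
        int varargsFlag();
        int enumFlag();
    }

    /** One possible provider, using the values quoted in the review. */
    static final class ClassFileModifiers implements VmModifiers {
        @Override public int varargsFlag() { return 0x0080; }
        @Override public int enumFlag()    { return 0x4000; }
    }

    static boolean isVarArgs(VmModifiers vm, Method m) {
        return (m.getModifiers() & vm.varargsFlag()) != 0;
    }

    static boolean isEnum(VmModifiers vm, Class<?> c) {
        return (c.getModifiers() & vm.enumFlag()) != 0;
    }

    /** Finds a public method by name without checked exceptions. */
    static Method findMethod(Class<?> c, String name) {
        for (Method m : c.getMethods()) {
            if (m.getName().equals(name)) {
                return m;
            }
        }
        throw new IllegalArgumentException("no public method named " + name);
    }

    public static void main(String[] args) {
        VmModifiers vm = new ClassFileModifiers();
        System.out.println("String.format varargs: "
                + isVarArgs(vm, findMethod(String.class, "format")));
        System.out.println("String.length varargs: "
                + isVarArgs(vm, findMethod(String.class, "length")));
        System.out.println("Thread.State enum:     "
                + isEnum(vm, Thread.State.class));
    }
}
```

With this shape the VM-independent code never hard-codes a flag value; a different VM would only need to supply another VmModifiers implementation.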

> src/share/vm/jvmci/vmStructs_jvmci.cpp
> 
>    declare_constant(JVM_ACC_FIELD_HAS_GENERIC_SIGNATURE)                   \
> 
> +  declare_preprocessor_constant("JVM_ACC_VARARGS", JVM_ACC_VARARGS)       \
> +  declare_preprocessor_constant("JVM_ACC_BRIDGE", JVM_ACC_BRIDGE)   \
> +  declare_preprocessor_constant("JVM_ACC_ANNOTATION", JVM_ACC_ANNOTATION)   \
> +  declare_preprocessor_constant("JVM_ACC_ENUM", JVM_ACC_ENUM)   \
> 
>    declare_preprocessor_constant("JVM_ACC_SYNTHETIC", JVM_ACC_SYNTHETIC)   \
> 
> Please align the '\'.

Done.

> 
> Otherwise this looks good and generally a good cleanup.

Thanks for the review.

-Doug

From jamsheed.c.m at oracle.com  Wed Sep  7 14:14:15 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Wed, 7 Sep 2016 19:44:15 +0530
Subject: RFR: 8164508: unexpected profiling mismatch in c1 generated code
In-Reply-To: <3ace86b5-ff91-7d2c-9b74-e4b9c497365c@oracle.com>
References: <6d873f34-2f65-e96f-0189-ca8af07ae824@oracle.com>
	<3ace86b5-ff91-7d2c-9b74-e4b9c497365c@oracle.com>
Message-ID: <2c802d19-5c80-afb1-cd24-943bced4d5ff@oracle.com>

Thank you, Vladimir.

Best Regards,

Jamsheed


On 9/6/2016 9:38 PM, Vladimir Kozlov wrote:
> Good.
>
> thanks,
> Vladimir
>
> On 9/5/16 12:53 AM, Jamsheed C m wrote:
>> Hi,
>>
>> webrev: http://cr.openjdk.java.net/~jcm/8164508/webrev.00/
>>
>> bug id: https://bugs.openjdk.java.net/browse/JDK-8164508
>>
>>
>> We were skipping profiling of the first argument (recv) for virtual call
>> sites to a static callee. This was not done for the non-inline case in C1
>> (see linked case for reference:
>> https://bugs.openjdk.java.net/browse/JDK-8027631).
>>
>> - bool has_receiver = x->inlined() && !x->callee()->is_static() &&
>> !Bytecodes::has_receiver(bc);
>> + bool has_receiver = x->callee()->is_loaded() &&
>> !x->callee()->is_static() && !Bytecodes::has_receiver(bc);
>>
>> The above change is not absolutely necessary, as this can happen only when
>> inlining at _linkToVirtual and _linkToInterface sites at present, and linker
>> elimination and callee inlining always happen together in C1.
>>
>> Please review.
>>
>> Best Regards,
>> Jamsheed
>>
>>
>>


From michael.c.berg at intel.com  Wed Sep  7 16:57:45 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Wed, 7 Sep 2016 16:57:45 +0000
Subject: CR for RFR 8165565
Message-ID: <C568518E7B433348B114B6A7122D474756F0C9F3@FMSMSX102.amr.corp.intel.com>

Hi Folks,

Some cases of CountedLoopEnd have side-effect code on targets like SKX for vector-processed post loops, and these are unsafe to translate to short branch versions. A recent change between b126 and b127 exposes this problem. The simple solution is to not allow short branch mapping for these cases. This produces correct code. A patch will be uploaded shortly to exemplify this case. The failures show up in the scimark metrics of SPECjvm2008 from b127 onward on SKX targets.

This code was tested as follows: hotspot jtreg and SPECjvm2008 on bdw and skx complete with no issues.

Bug-id: https://bugs.openjdk.java.net/browse/JDK-8165565

webrev:
http://cr.openjdk.java.net/~mcberg/8165565/webrev.01/

Essentially it preserves this ad file pattern on x86 by disallowing branch shortening (this instruction pattern is predicate-guarded for SKX-like platforms):

jmpLoopEnd_and_restoreMask() {
  match(CountedLoopEnd cop cr);
    __ jcc((Assembler::Condition)($cop$$cmpcode), *L, false);
    __ restorevectmask();

which has the restore vector mask side effect code.

Regards,
Michael

From vladimir.kozlov at oracle.com  Wed Sep  7 17:09:50 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 7 Sep 2016 10:09:50 -0700
Subject: CR for RFR 8165565
In-Reply-To: <C568518E7B433348B114B6A7122D474756F0C9F3@FMSMSX102.amr.corp.intel.com>
References: <C568518E7B433348B114B6A7122D474756F0C9F3@FMSMSX102.amr.corp.intel.com>
Message-ID: <a4875e3c-6612-77f2-7553-c936bda2ef78@oracle.com>

Michael, I think you should instead modify the first condition in
InstructForm::check_branch_variant()

Thanks,
Vladimir

On 9/7/16 9:57 AM, Berg, Michael C wrote:
> Hi Folks,
>
> Some cases of CountedLoopEnd have side effect code on targets like SKX
> for vector processed post loops that are unsafe to translate to short
> branch versions. A recent change between b126 and b127 exposes this
> problem. The simple solution is to not allow short branch mapping for
> these cases. This produces correct code. A patch will be uploaded
> shortly to exemplify this case. The failures show up in SPECjvm2008 in
> the scimark metrics after b127 inclusive for SKX targets.
>
>
>
> This code was tested as follows: hotspot jreg, SPECjvm2008 on bdw and
> skx complete with no issues.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8165565
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8165565/webrev.01/
>
>
>
> Essentially it preserves this ad file pattern on x86 by disallowing
> branch shortening (this instruction pattern is predicate guarded for skx
> like platforms):
>
>
>
> jmpLoopEnd_and_restoreMask() {
>   match(CountedLoopEnd cop cr);
>     __ jcc((Assembler::Condition)($cop$$cmpcode), *L, false);
>     __ restorevectmask();
>
>
>
> which has the restore vector mask side effect code.
>
>
>
> Regards,
>
> Michael
>

From cthalinger at twitter.com  Wed Sep  7 17:19:05 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Wed, 7 Sep 2016 07:19:05 -1000
Subject: JVMCI compiler thread idle state is RUNNABLE
In-Reply-To: <7C8D02B7-0069-4D1C-BF47-D435F443EBD9@oracle.com>
References: <959DA194-390F-49B2-97FA-CE402CA9D03D@twitter.com>
	<7C8D02B7-0069-4D1C-BF47-D435F443EBD9@oracle.com>
Message-ID: <AB0D082C-5EA5-4F53-A519-7F0B8408F720@twitter.com>


> On Sep 7, 2016, at 12:34 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 06 Sep 2016, at 23:39, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>> One thing we noticed here at Twitter is that JVMCI threads are not hidden (is_hidden_from_external_view) but at the same time they show up as always active.
>> 
>> I don't know the history here but I'm speculating that since compiler threads were always hidden no-one bothered.  In the SIGQUIT thread dump compiler threads show up as RUNNABLE:
>> 
>> "C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fdcc2016800 nid=0x5103 waiting on condition [0x0000000000000000]
>>   java.lang.Thread.State: RUNNABLE
>> 
>> "C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fdcc2821800 nid=0x4f03 waiting on condition [0x0000000000000000]
>>   java.lang.Thread.State: RUNNABLE
>> 
>> The specification of RUNNABLE is:
>> 
>>        /**
>>         * Thread state for a runnable thread.  A thread in the runnable
>>         * state is executing in the Java virtual machine but it may
>>         * be waiting for other resources from the operating system
>>         * such as processor.
>>         */
>>        RUNNABLE,
>> 
>> and that makes sense.  But this is very confusing to the user (as one of our internal users reported to me).  Maybe JVMCI threads should just be hidden, too?
> 
> Why is this very confusing? All sorts of non-app threads show up in a SIGQUIT thread dump, don't they?

The confusing part is that JVMCI threads show up but C1/C2 threads don't.  Oh, maybe I wasn't clear enough.  I'm not talking about a SIGQUIT thread dump; I'm talking about the thread list you can get in Java code (JVM_GetAllThreads aka. "external view").
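
For reference, a minimal sketch of how the Java-level thread list can be inspected. It assumes that Thread.getAllStackTraces() reflects the same "external view" as JVM_GetAllThreads; that link is an assumption here, and the class and method names are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: prints the threads that are visible from Java code. */
final class ThreadListDemo {

    /** Returns thread name -> state for every thread the Java API reports. */
    static Map<String, Thread.State> visibleThreads() {
        Map<String, Thread.State> result = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            result.put(t.getName(), t.getState());
        }
        return result;
    }

    public static void main(String[] args) {
        // Threads hidden from the external view should not appear here,
        // whereas visible compiler threads would be listed with their state.
        visibleThreads().forEach(
                (name, state) -> System.out.println(name + ": " + state));
    }
}
```

Running this on a VM with a JVMCI compiler enabled and comparing the output with a SIGQUIT thread dump should show which compiler threads are visible through each route.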

> 
> Not sure if this is covered by your proposal/question but I don't think JVMCI compiler threads should be hidden from JVMTI otherwise they could never be debugged by a Java IDE.

I don't know which threads JVMTI sees but I would assume it's not using the "external view".

> 
> -Doug


From doug.simon at oracle.com  Wed Sep  7 17:32:02 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 7 Sep 2016 19:32:02 +0200
Subject: JVMCI compiler thread idle state is RUNNABLE
In-Reply-To: <AB0D082C-5EA5-4F53-A519-7F0B8408F720@twitter.com>
References: <959DA194-390F-49B2-97FA-CE402CA9D03D@twitter.com>
	<7C8D02B7-0069-4D1C-BF47-D435F443EBD9@oracle.com>
	<AB0D082C-5EA5-4F53-A519-7F0B8408F720@twitter.com>
Message-ID: <9ACAF625-ABE6-40DC-AECB-E57E17659AD8@oracle.com>


> On 07 Sep 2016, at 19:19, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
>> 
>> On Sep 7, 2016, at 12:34 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> 
>>> On 06 Sep 2016, at 23:39, Christian Thalinger <cthalinger at twitter.com> wrote:
>>> 
>>> One thing we noticed here at Twitter is that JVMCI threads are not hidden (is_hidden_from_external_view) but at the same time they show up as always active.
>>> 
>>> I don't know the history here but I'm speculating that since compiler threads were always hidden no-one bothered.  In the SIGQUIT thread dump compiler threads show up as RUNNABLE:
>>> 
>>> "C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fdcc2016800 nid=0x5103 waiting on condition [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> "C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fdcc2821800 nid=0x4f03 waiting on condition [0x0000000000000000]
>>>  java.lang.Thread.State: RUNNABLE
>>> 
>>> The specification of RUNNABLE is:
>>> 
>>>       /**
>>>        * Thread state for a runnable thread.  A thread in the runnable
>>>        * state is executing in the Java virtual machine but it may
>>>        * be waiting for other resources from the operating system
>>>        * such as processor.
>>>        */
>>>       RUNNABLE,
>>> 
>>> and that makes sense.  But this is very confusing to the user (as one of our internal users reported to me).  Maybe JVMCI threads should just be hidden, too?
>> 
>> Why is this very confusing? All sorts of non-app threads show up in a SIGQUIT thread dump, don't they?
> 
> The confusing part is that JVMCI threads show up but C1/C2 threads don't.  Oh, maybe I wasn't clear enough.  I'm not talking about a SIGQUIT thread dump; I'm talking about the thread list you can get in Java code (JVM_GetAllThreads aka. "external view").

I don't know what exactly is meant by "external view" but I see that a bunch of JVMTI methods use is_hidden_from_external_view which, as you observed, returns false for JVMCI compiler threads. That seems to imply that changing the returned value for JVMCI threads will probably make such threads disappear in a Java debugger. Maybe all uses of is_hidden_from_external_view need to make their own decision on whether or not to include CompilerThreads that can_call_java.

-Doug

From cthalinger at twitter.com  Wed Sep  7 17:52:19 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Wed, 7 Sep 2016 07:52:19 -1000
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
In-Reply-To: <6224CDA0-63E6-442C-BD13-732208FA75A2@oracle.com>
References: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
	<864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>
	<6224CDA0-63E6-442C-BD13-732208FA75A2@oracle.com>
Message-ID: <999A422E-6CF6-45C5-955B-D58745DBB456@twitter.com>


> On Sep 7, 2016, at 2:29 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 06 Sep 2016, at 20:12, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>> 
>>> On Sep 5, 2016, at 6:45 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.
>>> 
>>> http://cr.openjdk.java.net/~dnsimon/8165434/
>> 
>> src/jdk.vm.ci/share/classes/jdk.vm.ci.meta/src/jdk/vm/ci/meta/ModifiersProvider.java
>> 
>> +    int BRIDGE = 0x0040;
>> +    int VARARGS = 0x0080;
>> +    int SYNTHETIC = 0x1000;
>> +    int ANNOTATION = 0x2000;
>> +    int ENUM = 0x4000;
>> I wish we could avoid that.  We can't use this stuff because it's HotSpot-dependent, right?
>> +        assert ModifiersProvider.SYNTHETIC == getConstant("JVM_ACC_SYNTHETIC", Integer.class);
>> +        assert ModifiersProvider.ANNOTATION == getConstant("JVM_ACC_ANNOTATION", Integer.class);
>> +        assert ModifiersProvider.BRIDGE == getConstant("JVM_ACC_BRIDGE", Integer.class);
>> +        assert ModifiersProvider.VARARGS == getConstant("JVM_ACC_VARARGS", Integer.class);
>> +        assert ModifiersProvider.ENUM == getConstant("JVM_ACC_ENUM", Integer.class);
>> What if we convert these constants to interface methods and the VM-dependent part has to implement them?  Or maybe even keep the fields and assign them via interface methods.
> 
> Following your suggestion, I've factored out these VM dependent flags to a new HotSpotModifiers class:
> 
> http://cr.openjdk.java.net/~dnsimon/8165434.v2/

Excellent.  One question: I noticed HotSpotModifiers is an interface but no other class implements it.  Is there a reason for it being an interface?

Only nit, remove 2011:
   2  * Copyright (c) 2011, 2016, Oracle and/or its affiliates. All rights reserved.
> 
>> src/share/vm/jvmci/vmStructs_jvmci.cpp
>> 
>>   declare_constant(JVM_ACC_FIELD_HAS_GENERIC_SIGNATURE)                   \
>> 
>> +  declare_preprocessor_constant("JVM_ACC_VARARGS", JVM_ACC_VARARGS)       \
>> +  declare_preprocessor_constant("JVM_ACC_BRIDGE", JVM_ACC_BRIDGE)   \
>> +  declare_preprocessor_constant("JVM_ACC_ANNOTATION", JVM_ACC_ANNOTATION)   \
>> +  declare_preprocessor_constant("JVM_ACC_ENUM", JVM_ACC_ENUM)   \
>> 
>>   declare_preprocessor_constant("JVM_ACC_SYNTHETIC", JVM_ACC_SYNTHETIC)   \
>> 
>> Please align the '\'.
> 
> Done.

Looks good.

> 
>> 
>> Otherwise this looks good and generally a good cleanup.
> 
> Thanks for the review.
> 
> -Doug


From cthalinger at twitter.com  Wed Sep  7 17:59:32 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Wed, 7 Sep 2016 07:59:32 -1000
Subject: JVMCI compiler thread idle state is RUNNABLE
In-Reply-To: <9ACAF625-ABE6-40DC-AECB-E57E17659AD8@oracle.com>
References: <959DA194-390F-49B2-97FA-CE402CA9D03D@twitter.com>
	<7C8D02B7-0069-4D1C-BF47-D435F443EBD9@oracle.com>
	<AB0D082C-5EA5-4F53-A519-7F0B8408F720@twitter.com>
	<9ACAF625-ABE6-40DC-AECB-E57E17659AD8@oracle.com>
Message-ID: <31663455-9FF4-4FDF-A33D-02CEBE488D00@twitter.com>


> On Sep 7, 2016, at 7:32 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 07 Sep 2016, at 19:19, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>>> 
>>> On Sep 7, 2016, at 12:34 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> 
>>>> On 06 Sep 2016, at 23:39, Christian Thalinger <cthalinger at twitter.com> wrote:
>>>> 
>>>> One thing we noticed here at Twitter is that JVMCI threads are not hidden (is_hidden_from_external_view) but at the same time they show up as always active.
>>>> 
>>>> I don't know the history here but I'm speculating that since compiler threads were always hidden no-one bothered.  In the SIGQUIT thread dump compiler threads show up as RUNNABLE:
>>>> 
>>>> "C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fdcc2016800 nid=0x5103 waiting on condition [0x0000000000000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>> 
>>>> "C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fdcc2821800 nid=0x4f03 waiting on condition [0x0000000000000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>> 
>>>> The specification of RUNNABLE is:
>>>> 
>>>>      /**
>>>>       * Thread state for a runnable thread.  A thread in the runnable
>>>>       * state is executing in the Java virtual machine but it may
>>>>       * be waiting for other resources from the operating system
>>>>       * such as processor.
>>>>       */
>>>>      RUNNABLE,
>>>> 
>>>> and that makes sense.  But this is very confusing to the user (as one of our internal users reported to me). Maybe JVMCI threads should just be hidden, too?
>>> 
>>> Why is this very confusing? All sorts of non-app threads show up in a SIGQUIT thread dump, don't they?
>> 
>> The confusing part is that JVMCI threads show up but C1/C2 threads don't.  Oh, maybe I wasn't clear enough.  I'm not talking about a SIGQUIT thread dump; I'm talking about the thread list you can get in Java code (JVM_GetAllThreads aka. "external view").
> 
> I don't know what exactly is meant by "external view" but I see that a bunch of JVMTI methods use is_hidden_from_external_view which, as you observed, returns false for JVMCI compiler threads. That seems to imply that changing the returned value for JVMCI threads will probably make such threads disappear in a Java debugger.

Yes, I've seen the code in JVMTI too.  We need a JVMTI expert.

> Maybe all uses of is_hidden_from_external_view need to make their own decision on whether or not to include CompilerThreads that can_call_java.

That would be suboptimal but from what I can see that might be the only way.  I think we need another method next to is_hidden_from_external_view that distinguishes between native compiler threads and JVMCI threads.  can_call_java might be that method but it has a horrible name.
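
To make the "each use decides for itself" option concrete, here is a toy model in Java. It is not HotSpot code: the Kind enum, the two policy methods and the sample thread names are all hypothetical, and only the behaviour of the two predicates follows what is described in this thread (C1/C2 threads are hidden, JVMCI threads are not and can call Java).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; this is not HotSpot code and the policies are hypothetical. */
final class ThreadVisibilityModel {

    enum Kind { JAVA, NATIVE_COMPILER, JVMCI_COMPILER }

    static final class VmThread {
        final String name;
        final Kind kind;

        VmThread(String name, Kind kind) {
            this.name = name;
            this.kind = kind;
        }

        /** As described in the thread: only native (C1/C2) compiler threads are hidden. */
        boolean isHiddenFromExternalView() {
            return kind == Kind.NATIVE_COMPILER;
        }

        /** JVMCI compiler threads run Java code; native compiler threads do not. */
        boolean canCallJava() {
            return kind != Kind.NATIVE_COMPILER;
        }
    }

    /** Hypothetical policy for the Java-level thread list: hide every compiler thread. */
    static List<String> javaThreadList(List<VmThread> all) {
        List<String> names = new ArrayList<>();
        for (VmThread t : all) {
            if (t.kind == Kind.JAVA) {
                names.add(t.name);
            }
        }
        return names;
    }

    /** Hypothetical policy for a debugger: keep anything that can execute Java code. */
    static List<String> debuggerView(List<VmThread> all) {
        List<String> names = new ArrayList<>();
        for (VmThread t : all) {
            if (!t.isHiddenFromExternalView() && t.canCallJava()) {
                names.add(t.name);
            }
        }
        return names;
    }

    /** Sample data with made-up thread names. */
    static List<VmThread> sample() {
        List<VmThread> all = new ArrayList<>();
        all.add(new VmThread("main", Kind.JAVA));
        all.add(new VmThread("C2 CompilerThread0", Kind.NATIVE_COMPILER));
        all.add(new VmThread("JVMCI CompilerThread0", Kind.JVMCI_COMPILER));
        return all;
    }

    public static void main(String[] args) {
        System.out.println("Java thread list: " + javaThreadList(sample()));
        System.out.println("Debugger view:    " + debuggerView(sample()));
    }
}
```

The point of the model is only that one predicate cannot serve both consumers; whether a second predicate or per-call-site logic is the better fix is the open question above.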

From michael.c.berg at intel.com  Wed Sep  7 18:04:02 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Wed, 7 Sep 2016 18:04:02 +0000
Subject: CR for RFR 8165565
In-Reply-To: <a4875e3c-6612-77f2-7553-c936bda2ef78@oracle.com>
References: <C568518E7B433348B114B6A7122D474756F0C9F3@FMSMSX102.amr.corp.intel.com>
	<a4875e3c-6612-77f2-7553-c936bda2ef78@oracle.com>
Message-ID: <C568518E7B433348B114B6A7122D474756F0CA6D@FMSMSX102.amr.corp.intel.com>

Vladimir please see the latest webrev:

http://cr.openjdk.java.net/~mcberg/8165565/webrev.02a/

It has the change for check_branch_variant.
I have tested it out and it works as advertised.

Regards,
Michael


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Wednesday, September 07, 2016 10:10 AM
To: Berg, Michael C <michael.c.berg at intel.com>; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: CR for RFR 8165565

Michael, I think you should instead modify first condition in
InstructForm::check_branch_variant()

Thanks,
Vladimir

On 9/7/16 9:57 AM, Berg, Michael C wrote:
> Hi Folks,
>
> Some cases of CountedLoopEnd have side effect code on targets like SKX 
> for vector processed post loops that are unsafe to translate to short 
> branch versions. A recent change between b126 and b127 exposes this 
> problem. The simple solution is to not allow short branch mapping for 
> these cases. This produces correct code. A patch will be uploaded 
> shortly to exemplify this case. The failures show up in SPECjvm2008 in 
> the scimark metrics after b127 inclusive for SKX targets.
>
>
>
> This code was tested as follows: hotspot jreg, SPECjvm2008 on bdw and 
> skx complete with no issues.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8165565
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8165565/webrev.01/
>
>
>
> Essentially it preserves this ad file pattern on x86 by disallowing 
> branch shortening (this instruction pattern is predicate guarded for 
> skx like platforms):
>
>
>
> jmpLoopEnd_and_restoreMask() {
>   match(CountedLoopEnd cop cr);
>     __ jcc((Assembler::Condition)($cop$$cmpcode), *L, false);
>     __ restorevectmask();
>
>
>
> which has the restore vector mask side effect code.
>
>
>
> Regards,
>
> Michael
>

From vladimir.kozlov at oracle.com  Wed Sep  7 18:05:20 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 7 Sep 2016 11:05:20 -0700
Subject: CR for RFR 8165565
In-Reply-To: <C568518E7B433348B114B6A7122D474756F0CA6D@FMSMSX102.amr.corp.intel.com>
References: <C568518E7B433348B114B6A7122D474756F0C9F3@FMSMSX102.amr.corp.intel.com>
	<a4875e3c-6612-77f2-7553-c936bda2ef78@oracle.com>
	<C568518E7B433348B114B6A7122D474756F0CA6D@FMSMSX102.amr.corp.intel.com>
Message-ID: <82c35d8b-b85a-2595-d6f8-849eaf619c16@oracle.com>

Looks good.

Thanks,
Vladimir

On 9/7/16 11:04 AM, Berg, Michael C wrote:
> Vladimir please see the latest webrev:
>
> http://cr.openjdk.java.net/~mcberg/8165565/webrev.02a/
>
> It has the change for check_branch_variant.
> I have tested it out and it works as advertised.
>
> Regards,
> Michael
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, September 07, 2016 10:10 AM
> To: Berg, Michael C <michael.c.berg at intel.com>; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: CR for RFR 8165565
>
> Michael, I think you should instead modify first condition in
> InstructForm::check_branch_variant()
>
> Thanks,
> Vladimir
>
> On 9/7/16 9:57 AM, Berg, Michael C wrote:
>> Hi Folks,
>>
>> Some cases of CountedLoopEnd have side effect code on targets like SKX
>> for vector processed post loops that are unsafe to translate to short
>> branch versions. A recent change between b126 and b127 exposes this
>> problem. The simple solution is to not allow short branch mapping for
>> these cases. This produces correct code. A patch will be uploaded
>> shortly to exemplify this case. The failures show up in SPECjvm2008 in
>> the scimark metrics after b127 inclusive for SKX targets.
>>
>>
>>
>> This code was tested as follows: hotspot jreg, SPECjvm2008 on bdw and
>> skx complete with no issues.
>>
>>
>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8165565
>>
>>
>> webrev:
>>
>> http://cr.openjdk.java.net/~mcberg/8165565/webrev.01/
>>
>>
>>
>> Essentially it preserves this ad file pattern on x86 by disallowing
>> branch shortening (this instruction pattern is predicate guarded for
>> skx like platforms):
>>
>>
>>
>> jmpLoopEnd_and_restoreMask() {
>>   match(CountedLoopEnd cop cr);
>>     __ jcc((Assembler::Condition)($cop$$cmpcode), *L, false);
>>     __ restorevectmask();
>>
>>
>>
>> which has the restore vector mask side effect code.
>>
>>
>>
>> Regards,
>>
>> Michael
>>

From jamsheed.c.m at oracle.com  Thu Sep  8 09:56:14 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Thu, 8 Sep 2016 15:26:14 +0530
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
Message-ID: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>

Hi All,

bugid: https://bugs.openjdk.java.net/browse/JDK-8134389

webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/

Return type information is not available in lforms. This causes
contradictions in operations such as indexed stores, argument type
casting at mh _linkTo* sites, etc.

Fix: TypeCast to the declared return type at the lform return.

Best Regards,

Jamsheed


From vitalyd at gmail.com  Thu Sep  8 10:38:46 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 08 Sep 2016 10:38:46 +0000
Subject: MaxBCEAEstimateSize and inlining clarification
Message-ID: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>

Hi guys,

I'm hoping someone could clarify how MaxBCEAEstimateSize interacts with
inlining.

The default max size is 150, nearly half the size of FreqInlineSize. Is EA
eligibility performed on a method before it's inlined then? I can't imagine
that 150 is the limit after inlining. If it's before inlining, how exactly
does this work after the method is inlined since the inlined call graph may
have quite a bit of code and thus EA may take a while? My understanding is
EA is run after inlining to maximize its effectiveness. Or is the
MaxBCEAEstimateLevel used as pseudo inlining for the analysis?

I'm seeing some code that iterates over a ConcurrentHashMap's entrySet that
allocates tens of GB of CHM$MapEntry objects even though they don't escape.
I'm also seeing some other places where EA ought to be kicking in but
isn't. So I'd like to understand the nuances of it a bit better.

Thanks

From rwestrel at redhat.com  Thu Sep  8 12:25:13 2016
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 08 Sep 2016 14:25:13 +0200
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
Message-ID: <dk6fupaino6.fsf@rwestrel.remote.csb>


> I'm hoping someone could clarify how MaxBCEAEstimateSize interacts with
> inlining.
>
> The default max size is 150, nearly half the size of FreqInlineSize. Is EA
> eligibility performed on a method before it's inlined then? I can't imagine
> that 150 is the limit after inlining. If it's before inlining, how exactly
> does this work after the method is inlined since the inlined call graph may
> have quite a bit of code and thus EA may take a while? My understanding is
> EA is run after inlining to maximize its effectiveness. Or is the
> MaxBCEAEstimateLevel used as pseudo inlining for the analysis?

EA happens after inlining. For calls that are not inlined, the bytecodes
of the callees are analyzed to find more opportunities for
EA. MaxBCEAEstimateSize affects the pass that operates on bytecodes of
non-inlined methods.

Roland.
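
A small test case in the spirit of the entrySet question, for anyone who wants to experiment. It is a sketch, not a reproduction of the reported problem: the class name, map size and repetition count are arbitrary, and whether the iterator and entry objects are actually eliminated depends on the JDK build and on inlining decisions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: iterates an entrySet whose entries never escape the loop. */
final class EntrySetSum {

    /** Sums all values; the Map.Entry objects are used only inside the loop. */
    static long sumValues(Map<Integer, Long> map) {
        long sum = 0;
        for (Map.Entry<Integer, Long> e : map.entrySet()) {
            sum += e.getValue();
        }
        return sum;
    }

    /** Builds a ConcurrentHashMap holding the pairs (0, 0) .. (n-1, n-1). */
    static Map<Integer, Long> sampleMap(int n) {
        Map<Integer, Long> map = new ConcurrentHashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(i, (long) i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, Long> map = sampleMap(1_000);
        long total = 0;
        // Repeat often enough for the method to be compiled by C2.
        for (int rep = 0; rep < 20_000; rep++) {
            total += sumValues(map);
        }
        System.out.println(total);
    }
}
```

Comparing GC activity between a default run and one with -XX:-DoEscapeAnalysis, or with a larger -XX:MaxBCEAEstimateSize, gives a rough indication of whether EA is removing the per-entry allocations.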

From vitalyd at gmail.com  Thu Sep  8 12:31:56 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 8 Sep 2016 08:31:56 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <dk6fupaino6.fsf@rwestrel.remote.csb>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<dk6fupaino6.fsf@rwestrel.remote.csb>
Message-ID: <CAHjP37GUxDu0GGLY+G0w8sh1rg7FrYZ5AuiG2=r8qVvy_7nyhQ@mail.gmail.com>

Hi Roland,

Thanks for the quick reply.

On Thu, Sep 8, 2016 at 8:25 AM, Roland Westrelin <rwestrel at redhat.com>
wrote:

>
> > I'm hoping someone could clarify how MaxBCEAEstimateSize interacts with
> > inlining.
> >
> > The default max size is 150, nearly half the size of FreqInlineSize. Is
> EA
> > eligibility performed on a method before it's inlined then? I can't
> imagine
> > that 150 is the limit after inlining. If it's before inlining, how
> exactly
> > does this work after the method is inlined since the inlined call graph
> may
> > have quite a bit of code and thus EA may take a while? My understanding
> is
> > EA is run after inlining to maximize its effectiveness. Or is the
> > MaxBCEAEstimateLevel used as pseudo inlining for the analysis?
>
> EA happens after inlining. For calls that are not inlined, the bytecodes
> of the callees is analyzed to find more opportunities for
> EA. MaxBCEAEstimateSize affects the pass that operates on bytecodes of
> non inlined methods.
>
Ok, I see - so this flag has no bearing on a method that's inlined.
Great.  Are there any other conditions/flags that may prevent EA from
running? I'm talking about things other than an object escaping in some
paths (i.e. the control flow insensitive EA as implemented in C2) or the
ordering of EA vs loop unrolling (as I came to find out a few months ago on
this list).

Are OSR compilations performed with EA?

>
> Roland.
>
Thanks again.

From vladimir.x.ivanov at oracle.com  Thu Sep  8 12:38:35 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 8 Sep 2016 15:38:35 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
Message-ID: <9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>

Vitaly,

> The default max size is 150, nearly half the size of FreqInlineSize. Is
> EA eligibility performed on a method before it's inlined then? I can't
> imagine that 150 is the limit after inlining. If it's before inlining,
> how exactly does this work after the method is inlined since the inlined
> call graph may have quite a bit of code and thus EA may take a while? My
> understanding is EA is run after inlining to maximize its effectiveness.
> Or is the MaxBCEAEstimateLevel used as pseudo inlining for the analysis?

Yes, it's sort of "pseudo inlining". EA happens after inlining is over 
(both parse & post-parse phases). For calls with known target, EA 
performs static analysis to compute escape info for arguments. It 
happens for methods smaller than MaxBCEAEstimateSize. 
MaxBCEAEstimateLevel limits the inlining depth during analysis.

> I'm seeing some code that iterates over a ConcurrentHashMap's entrySet
> that allocates tens of GB of CHM$MapEntry objects even though they don't
> escape. I'm also seeing some other places where EA ought to be kicking
> in but isn't. So I'd like to understand the nuances of it a bit better.

I wish -XX:+PrintEscapeAnalysis & -XX:+PrintEliminateAllocations were 
available in product binaries, but they aren't, unfortunately. You can 
build an "optimized" JVM though. It's close to product binaries w.r.t. 
speed, but also provides most of the diagnostic logic (e.g., all 
nonproduct flags are available).

If autoboxing is involved, you can try -XX:+AggressiveUnboxing.

Best regards,
Vladimir Ivanov
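
For the autoboxing case, a minimal loop to try the flag on. This is only a sketch with made-up names; it shows the kind of box-then-unbox pattern in question and makes no claim about what a given JDK build does with it.

```java
/** Illustrative only: boxes and immediately unboxes a value in a hot loop. */
final class BoxedSum {

    /** Sums 0 .. n-1 through a boxed temporary that never leaves the loop body. */
    static long sumBoxed(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Integer boxed = i;   // autoboxing via Integer.valueOf(i)
            sum += boxed;        // unboxed again right away
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Repeat often enough for the method to be compiled by C2.
        for (int rep = 0; rep < 20_000; rep++) {
            total += sumBoxed(10_000);
        }
        System.out.println(total);
    }
}
```

On 8u92 the flag is reported later in this thread to be experimental, so a run would look like java -XX:+UnlockExperimentalVMOptions -XX:+AggressiveUnboxing BoxedSum.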

From vitalyd at gmail.com  Thu Sep  8 13:07:17 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 8 Sep 2016 09:07:17 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
Message-ID: <CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>

Hi Vladimir,

On Thu, Sep 8, 2016 at 8:38 AM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

> Vitaly,
>
> The default max size is 150, nearly half the size of FreqInlineSize. Is
>> EA eligibility performed on a method before it's inlined then? I can't
>> imagine that 150 is the limit after inlining. If it's before inlining,
>> how exactly does this work after the method is inlined since the inlined
>> call graph may have quite a bit of code and thus EA may take a while? My
>> understanding is EA is run after inlining to maximize its effectiveness.
>> Or is the MaxBCEAEstimateLevel used as pseudo inlining for the analysis?
>>
>
> Yes, it's sort of "pseudo inlining". EA happens after inlining is over
> (both parse & post-parse phases). For calls with known target, EA performs
> static analysis to compute escape info for arguments. It happens for
> methods smaller than MaxBCEAEstimateSize. MaxBCEAEstimateLevel limits the
> inlining depth during analysis.

By "known target", does that take profiling into account or does it have to be
statically known? But basically, it sounds like this is what Roland said --
any methods not inlined for whatever reason (not hot enough, too big, etc)
are also inspected for EA purposes, but with the MaxBCEAEstimateSize and
Level limits.

>
>
> I'm seeing some code that iterates over a ConcurrentHashMap's entrySet
>> that allocates tens of GB of CHM$MapEntry objects even though they don't
>> escape. I'm also seeing some other places where EA ought to be kicking
>> in but isn't. So I'd like to understand the nuances of it a bit better.
>>
>
> I wish -XX:+PrintEscapeAnalysis & -XX:+PrintEliminateAllocations were
> available in product binaries, but they aren't, unfortunately.

Yes, that would be great! Is there a good reason they couldn't be turned
into prod flags for, say, java 9?


> You can build an "optimized" JVM though. It's close to product binaries
> w.r.t. speed, but also provides most of the diagnostic logic (e.g.,
> all nonproduct flags are available).


> If autoboxing is involved, you can try -XX:+AggressiveUnboxing.
>
So I see this is behind UnlockExperimentalVMOptions (I'm on 8u92).  Some of
the instances I'm seeing are, indeed, autoboxing.  Is this feature stable?
What additional optimizations does it enable? Or put another way, why is it
experimental? :)

>
> Best regards,
> Vladimir Ivanov
>

Thanks Vladimir, very helpful.

From doug.simon at oracle.com  Thu Sep  8 13:12:34 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 8 Sep 2016 15:12:34 +0200
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
In-Reply-To: <999A422E-6CF6-45C5-955B-D58745DBB456@twitter.com>
References: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
	<864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>
	<6224CDA0-63E6-442C-BD13-732208FA75A2@oracle.com>
	<999A422E-6CF6-45C5-955B-D58745DBB456@twitter.com>
Message-ID: <21860311-D6E9-482B-B0A0-F488A516A1D3@oracle.com>


> On 07 Sep 2016, at 19:52, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
>> 
>> On Sep 7, 2016, at 2:29 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>>> 
>>> On 06 Sep 2016, at 20:12, Christian Thalinger <cthalinger at twitter.com> wrote:
>>> 
>>> 
>>>> On Sep 5, 2016, at 6:45 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.
>>>> 
>>>> http://cr.openjdk.java.net/~dnsimon/8165434/
>>> 
>>> src/jdk.vm.ci/share/classes/jdk.vm.ci.meta/src/jdk/vm/ci/meta/ModifiersProvider.java
>>> 
>>> +    int BRIDGE = 0x0040;
>>> +    int VARARGS = 0x0080;
>>> +    int SYNTHETIC = 0x1000;
>>> +    int ANNOTATION = 0x2000;
>>> +    int ENUM = 0x4000;
>>> I wish we could avoid that.  We can't use this stuff because it's HotSpot-dependent, right?
>>> +        assert ModifiersProvider.SYNTHETIC == getConstant("JVM_ACC_SYNTHETIC", Integer.class);
>>> +        assert ModifiersProvider.ANNOTATION == getConstant("JVM_ACC_ANNOTATION", Integer.class);
>>> +        assert ModifiersProvider.BRIDGE == getConstant("JVM_ACC_BRIDGE", Integer.class);
>>> +        assert ModifiersProvider.VARARGS == getConstant("JVM_ACC_VARARGS", Integer.class);
>>> +        assert ModifiersProvider.ENUM == getConstant("JVM_ACC_ENUM", Integer.class);
>>> What if we convert these constants to interface methods and the VM-dependent part has to implement them?  Or maybe even keep the fields and assign them via interface methods.
>> 
>> Following your suggestion, I've factored out these VM dependent flags to a new HotSpotModifiers class:
>> 
>> http://cr.openjdk.java.net/~dnsimon/8165434.v2/
> 
> Excellent.  One question? I noticed HotSpotModifiers is an interface but no other class implements it.  Is there a reason for it being an interface?

Nope. It's now a class.
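
For readers without the webrev at hand, a rough sketch of what a
constants-only final class of this kind could look like. The name
VmModifiersSketch and the helper methods are made up here, and the real
HotSpotModifiers in the webrev may well differ. Only the five constant values
are taken from the diff quoted above. The checks use public reflection API
only, so no setAccessible is involved:

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for illustration; not the class from the webrev.
final class VmModifiersSketch {
    // Values copied from the quoted diff.
    static final int BRIDGE = 0x0040;
    static final int VARARGS = 0x0080;
    static final int SYNTHETIC = 0x1000;
    static final int ANNOTATION = 0x2000;
    static final int ENUM = 0x4000;

    private VmModifiersSketch() {
        // Constants holder, never instantiated.
    }

    static boolean isVarArgs(Method m) {
        return (m.getModifiers() & VARARGS) != 0;
    }

    static boolean isEnum(Class<?> c) {
        return (c.getModifiers() & ENUM) != 0;
    }

    // Checks the bit tests against a few well-known JDK members.
    static boolean demo() {
        try {
            Method format = String.class.getMethod("format", String.class, Object[].class);
            Method length = String.class.getMethod("length");
            return isVarArgs(format) && !isVarArgs(length)
                    && isEnum(Thread.State.class) && !isEnum(String.class);
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```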

> 
> Only nit, remove 2011:
>    2  * Copyright (c) 2011, 2016, Oracle and/or its affiliates. All rights reserved.

Fixed.

-Doug

From vladimir.x.ivanov at oracle.com  Thu Sep  8 13:43:06 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 8 Sep 2016 16:43:06 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
Message-ID: <156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>


>     Yes, it's sort of "pseudo inlining". EA happens after inlining is
>     over (both parse & post-parse phases). For calls with known target,
>     EA performs static analysis to compute escape info for arguments. It
>     happens for methods smaller than MaxBCEAEstimateSize.
>     MaxBCEAEstimateLevel limits the inlining depth during analysis.
>
> By "known target", does that take profiling into account or it has to be
> statically known? But basically, it sounds like this is what Roland said
> -- any methods not inlined for whatever reason (not hot enough, too big,
> etc) are also inspected for EA purposes, but with the
> MaxBCEAEstimateSize and Level limits.

Profiling info isn't used at all. At the beginning all calls with known 
targets are already static calls (CallStaticJavaNode in the IR). And 
during analysis only static info (CHA) is used to devirtualize calls.

>     I wish -XX:+PrintEscapeAnalysis & -XX:+PrintEliminateAllocations
>     were available in product binaries, but they aren't, unfortunately.
>
> Yes, that would be great! Is there a good reason they couldn't be turned
> into prod flags for, say, java 9?

It's not that simple, since the flags use dumping logic not available in 
product binaries (e.g., Node::dump() to print corresponding IR nodes).

I don't see a compelling reason not to have all the dumping logic 
available in product binaries, but it's a much larger project compared 
to changing the type of a couple of flags from "nonproduct" to "diagnostic".

>     You can build an "optimized" JVM though. It's close to product
>     binaries w.r.t. speed, but also provides most of the diagnostic
>     logic (e.g., all nonproduct flags are available).
>
>
>     If autoboxing is involved, you can try -XX:+AggressiveUnboxing.
>
> So I see this is behind UnlockExperimentalVMOptions (I'm on 8u92).  Some
> of the instances I'm seeing are, indeed, autoboxing.  Is this feature
> stable? What additional optimizations does it enable? Or put another
> way, why is it experimental? :)

The approach it takes to box elimination is more reliable (it operates on 
valueOf calls instead of an inlined method).

At least, we are not aware of any bugs in the implementation. It is 
still experimental because we haven't had time to test it thoroughly yet 
and it fell off our radar after integration. We hope to take care of 
it in 9u.
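
As a small illustration of where the valueOf calls come from (class and
method names are made up): javac compiles the boxing assignment below into a
call to Integer.valueOf(int), and the unboxing into Integer.intValue(). On
8u92 the flag could be tried on such code with
-XX:+UnlockExperimentalVMOptions -XX:+AggressiveUnboxing. Whether the boxes
are actually eliminated is not something this sketch can show:

```java
// Hypothetical micro-example of autoboxing in a hot loop.
class BoxingSketch {
    static long sumBoxed(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Integer boxed = i; // javac emits Integer.valueOf(i)
            total += boxed;    // javac emits boxed.intValue()
        }
        return total;
    }

    public static void main(String[] args) {
        long total = 0;
        // Repeat often enough for the loop to get JIT-compiled.
        for (int iter = 0; iter < 20_000; iter++) {
            total += sumBoxed(1000);
        }
        System.out.println(total);
    }
}
```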

Best regards,
Vladimir Ivanov

From vitalyd at gmail.com  Thu Sep  8 13:54:47 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 8 Sep 2016 09:54:47 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
Message-ID: <CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>

On Thu, Sep 8, 2016 at 9:43 AM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

>
>     Yes, it's sort of "pseudo inlining". EA happens after inlining is
>>     over (both parse & post-parse phases). For calls with known target,
>>     EA performs static analysis to compute escape info for arguments. It
>>     happens for methods smaller than MaxBCEAEstimateSize.
>>     MaxBCEAEstimateLevel limits the inlining depth during analysis.
>>
>> By "known target", does that take profiling into account or it has to be
>> statically known? But basically, it sounds like this is what Roland said
>> -- any methods not inlined for whatever reason (not hot enough, too big,
>> etc) are also inspected for EA purposes, but with the
>> MaxBCEAEstimateSize and Level limits.
>>
>
> Profiling info isn't used at all. At the beginning all calls with known
> targets are already static calls (CallStaticJavaNode in the IR). And during
> analysis only static info (CHA) is used to devirtualize calls.

Thanks.  Does that mean that marking classes/methods final helps here even
if at runtime there're no subclasses? I know marking classes/methods final
removes the need to register guards (by virtue of making the call static
when receiver is final), but does it have added benefit for EA purposes
here as well?

I'm slightly confused by your "only static info (CHA) is used to
devirtualize calls" statement.  Are you referring to the same CHA concept
where loaded class hierarchy is inspected? It sounds like you're not since
you mention "static info", but CHA is dynamic in my mind.  I'm probably
misinterpreting this.

>
>
>     I wish -XX:+PrintEscapeAnalysis & -XX:+PrintEliminateAllocations
>>     were available in product binaries, but they aren't, unfortunately.
>>
>> Yes, that would be great! Is there a good reason they couldn't be turned
>> into prod flags for, say, java 9?
>>
>
> It's not that simple, since the flags use dumping logic not available in
> product binaries (e.g., Node::dump() to print corresponding IR nodes).
>
> I don't see a compelling reason not to have all the dumping logic
> available in product binaries, but it's much larger project, comparing to
> changing type for a couple of flags from "nonproduct" to "diagnostic".

Ok, understood.  I do think this would be very valuable, so if you guys can
make it happen it'd be greatly appreciated.

>
>
>     You can build an "optimized" JVM though. It's close to product
>>     binaries w.r.t. speed, but also provides most of the diagnostic
>>     logic (e.g., all nonproduct flags are available).
>>
>>
>>     If autoboxing is involved, you can try -XX:+AggressiveUnboxing.
>>
>> So I see this is behind UnlockExperimentalVMOptions (I'm on 8u92).  Some
>> of the instances I'm seeing are, indeed, autoboxing.  Is this feature
>> stable? What additional optimizations does it enable? Or put another
>> way, why is it experimental? :)
>>
>
> The approach to box elimination it does is more reliable (operates on
> valueOf calls instead of an inlined method).
>
> At least, we are not aware of any bugs in the implementation. It is still
> experimental because we haven't had time to test it thoroughly yet and it
> fell off our radar after integration. We hope to take care of it in 9u.
>
Ok, thanks.

>
> Best regards,
> Vladimir Ivanov
>

From filipp.zhinkin at gmail.com  Thu Sep  8 14:14:26 2016
From: filipp.zhinkin at gmail.com (Filipp Zhinkin)
Date: Thu, 8 Sep 2016 17:14:26 +0300
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <28e894e35a3a431aa92d05b310b48970@DEWDFE13DE50.global.corp.sap>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
	<CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
	<a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>
	<CA+3eh13rLhfyHP_LNReFk4HCB4twxA51HU+MEo8JWD_y86KqRQ@mail.gmail.com>
	<CANQc0ndWJC9uWOz0Ytbs=Cf5APDijuSyyk8N7337oGCkk4TvuA@mail.gmail.com>
	<28e894e35a3a431aa92d05b310b48970@DEWDFE13DE50.global.corp.sap>
Message-ID: <CANQc0ncr_xKNTWB7YH2a7dQ+MAZ7JPexOdfL4DALR2a=jzeA2Q@mail.gmail.com>

Hi Goetz,

sorry for the late reply.

The change looks good to me.
Unfortunately I'm not able to sponsor it, because I'm not working at
Oracle and can't submit JPRT.

Regards,
Filipp.

On Tue, Sep 6, 2016 at 4:12 PM, Lindenmaier, Goetz
<goetz.lindenmaier at sap.com> wrote:
> Hi Filipp,
>
> thanks for reviewing my change!
> I fixed the two issues:
> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.bs/
> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.hs/
>
> The hotspot change is unchanged except for the reviewer attribution.
>
> I also fixed the comment in Platform.java: major->minor.
>
> Would you mind sponsoring the change?
>
> Best regards,
>   Goetz.
>
>
>> -----Original Message-----
>> From: Filipp Zhinkin [mailto:filipp.zhinkin at gmail.com]
>> Sent: Dienstag, 6. September 2016 13:46
>> To: Volker Simonis <volker.simonis at gmail.com>
>> Cc: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-compiler-
>> dev at openjdk.java.net
>> Subject: Re: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
>>
>> Hi,
>>
>> I would suggest to use something like Boolean.TRUE::booleanValue
>> instead of null in AndPredicate ctor and use camel case for
>> Platform's fields and methods.
>> Otherwise the change looks good.
>>
>> Just for the record: all those predicates were introduced because
>> there was no way to check OS/CPU/whatever using jtreg.
>> Now it should be possible to skip tests using jtreg's @requires tag. So
>> maybe we can get rid of some java code? :)
>> // Not suggesting to do it right now.
>>
>> Regards,
>> Filipp.
>>
>> On Tue, Sep 6, 2016 at 1:21 PM, Volker Simonis <volker.simonis at gmail.com>
>> wrote:
>> > Thumbs up from me!
>> >
>> > Volker
>> >
>> > On Tue, Sep 6, 2016 at 11:11 AM, Lindenmaier, Goetz
>> > <goetz.lindenmaier at sap.com> wrote:
>> >> Hi Volker,
>> >>
>> >> thanks for the review! I fixed the two issues:
>> >> http://cr.openjdk.java.net/~goetz/wr16/8165235-
>> osRecog/02/webrev.hs/
>> >> http://cr.openjdk.java.net/~goetz/wr16/8165235-
>> osRecog/02/webrev.bs/
>> >>
>> >> Best regards,
>> >>   Goetz.
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: Volker Simonis [mailto:volker.simonis at gmail.com]
>> >>> Sent: Montag, 5. September 2016 14:57
>> >>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>> >>> Cc: hotspot-compiler-dev at openjdk.java.net
>> >>> Subject: Re: RFR(M): 8165235: [TESTBUG] RTM tests must check OS
>> version
>> >>>
>> >>> Hi Goetz,
>> >>>
>> >>> I think you've only forgot to import
>> >>> compiler.testlibrary.rtm.predicate.SupportedOS into
>> >>> test/compiler/rtm/cli/TestUseRTMLockingOptionWithBiasedLocking.java
>> >>>
>> >>> Also, in SupportedOS.java the line:
>> >>>
>> >>> public boolean getAsBoolean()
>> >>>
>> >>> is indented to far (should be four spaces less like the annotation in
>> >>> the line before).
>> >>>
>> >>> Besides that, the change looks good.
>> >>>
>> >>> Thanks for fixing this,
>> >>> Volker
>> >>>
>> >>> On Mon, Sep 5, 2016 at 1:54 PM, Lindenmaier, Goetz
>> >>> <goetz.lindenmaier at sap.com> wrote:
>> >>> > Hi,
>> >>> >
>> >>> >
>> >>> >
>> >>> > This fixes the RTM tests wrt. to supported platforms on ppc.
>> >>> >
>> >>> > Please review this change. I please need a sponsor.
>> >>> > http://cr.openjdk.java.net/~goetz/wr16/8165235-
>> osRecog/01/webrev.bs/
>> >>> >
>> >>> > http://cr.openjdk.java.net/~goetz/wr16/8165235-
>> osRecog/01/webrev.hs/
>> >>> >
>> >>> >
>> >>> > RTM uses special instructions that are only available on recent x86
>> cpus. On
>> >>> > x86, this feature does not need OS support. On ppc, the equivalent
>> >>> > functionality, hardware transactional memory, requires OS support.
>> Thus
>> >>> the
>> >>> > feature is only enabled by the VM if CPU and OS are at a specific level.
>> The
>> >>> > tests must check this. too. This holds for AIX and Linux.
>> >>> >
>> >>> >
>> >>> >
>> >>> > To do so, this change introduces rtm/predicate/SupportedOS.java
>> which
>> >>> checks
>> >>> > for proper OS versions on ppc, else returns true.
>> >>> >
>> >>> > The OS version is retrieved from Platform.java, which has new
>> methods
>> >>> > getOsVersionMajor() and  getOsVersionMinor().
>> >>> >
>> >>> > To simplify the checks in the tests, I also introduced a 3-way
>> AndPredicate
>> >>> > constructor.
>> >>> >
>> >>> >
>> >>> >
>> >>> > To simplify the OS version check on Aix, I change enabling RTM on Aix
>> to
>> >>> > require AIX 7.2.
>> >>> >
>> >>> > Before, it was enabled on AIX 7.1.3.30, which contains an important
>> bug fix.
>> >>> > The
>> >>> >
>> >>> > last digits of this version are not exported to os.version property, so I
>> >>> > can not
>> >>> >
>> >>> > check for them in the test.
>> >>> >
>> >>> >
>> >>> >
>> >>> > Best regards,
>> >>> >
>> >>> >   Goetz.

From goetz.lindenmaier at sap.com  Thu Sep  8 14:36:00 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 8 Sep 2016 14:36:00 +0000
Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
In-Reply-To: <CANQc0ncr_xKNTWB7YH2a7dQ+MAZ7JPexOdfL4DALR2a=jzeA2Q@mail.gmail.com>
References: <f83c726cbcfd4eb596384c4e28f25a3f@DEWDFE13DE50.global.corp.sap>
	<CA+3eh11HfSGncvvWAymfMFqSRNV4VX7+WJpgkxWotJPGuh0jaA@mail.gmail.com>
	<a02689eb808c43a8899fd1578a2c9d00@DEWDFE13DE50.global.corp.sap>
	<CA+3eh13rLhfyHP_LNReFk4HCB4twxA51HU+MEo8JWD_y86KqRQ@mail.gmail.com>
	<CANQc0ndWJC9uWOz0Ytbs=Cf5APDijuSyyk8N7337oGCkk4TvuA@mail.gmail.com>
	<28e894e35a3a431aa92d05b310b48970@DEWDFE13DE50.global.corp.sap>
	<CANQc0ncr_xKNTWB7YH2a7dQ+MAZ7JPexOdfL4DALR2a=jzeA2Q@mail.gmail.com>
Message-ID: <a370adf5f203495487352f463fb290d1@DEWDFE13DE50.global.corp.sap>

Hi Filipp,

Oh, I understand.  Thanks for reviewing anyways!

Best regards,
  Goetz.

From goetz.lindenmaier at sap.com  Thu Sep  8 14:38:44 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 8 Sep 2016 14:38:44 +0000
Subject: please sponsor? RFR(M): 8165235: [TESTBUG] RTM tests must check OS
	version
Message-ID: <d509c4eeb57949c88f4b3df0b6f35409@DEWDFE13DE50.global.corp.sap>

Hi,

This change was reviewed by Volker Simonis and Filipp Zhinkin.
Final webrevs: 
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.bs/
http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.hs/

Could someone please sponsor?

Thanks!
  Goetz

> -----Original Message-----
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
> bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz
> Sent: Montag, 5. September 2016 13:55
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
> 
> Hi,
> 
> 
> 
> This fixes the RTM tests w.r.t. supported platforms on ppc.
> 
> Please review this change. I need a sponsor, please.
> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.bs/
> 
> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.hs/
> 
> 
> RTM uses special instructions that are only available on recent x86 cpus. On
> x86, this feature does not need OS support. On ppc, the equivalent
> functionality, hardware transactional memory, requires OS support. Thus the
> feature is only enabled by the VM if CPU and OS are at a specific level. The
> tests must check this, too. This holds for AIX and Linux.
> 
> 
> 
> To do so, this change introduces rtm/predicate/SupportedOS.java which
> checks for proper OS versions on ppc, else returns true.
> 
> The OS version is retrieved from Platform.java, which has new methods
> getOsVersionMajor() and  getOsVersionMinor().
> 
> To simplify the checks in the tests, I also introduced a 3-way AndPredicate
> constructor.
> 
> 
> 
> To simplify the OS version check on AIX, I changed the enabling of RTM on AIX
> to require AIX 7.2.
> 
> Before, it was enabled on AIX 7.1.3.30, which contains an important bug fix.
> The last digits of this version are not exported to the os.version property,
> so I cannot check for them in the test.
> 
> 
> 
> Best regards,
> 
>   Goetz.

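For readers who do not want to open the webrevs, the OS check and the 3-way
AndPredicate described in the quoted message could look roughly like the
sketch below. This is an illustration under assumptions, not the code from
the webrev: the class RtmPredicateSketch and the helpers versionPart() and
supportedOs() are made up, only the AIX 7.2 threshold comes from the
description, and since no minimum Linux version is given there, Linux on ppc
is simply accepted. The two-argument constructor uses
Boolean.TRUE::booleanValue instead of null, as suggested in the review:

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; the real classes live in the webrev.
class RtmPredicateSketch {
    static final class AndPredicate implements BooleanSupplier {
        private final BooleanSupplier a;
        private final BooleanSupplier b;
        private final BooleanSupplier c;

        AndPredicate(BooleanSupplier a, BooleanSupplier b) {
            // Neutral third operand instead of a null check.
            this(a, b, Boolean.TRUE::booleanValue);
        }

        AndPredicate(BooleanSupplier a, BooleanSupplier b, BooleanSupplier c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        @Override
        public boolean getAsBoolean() {
            return a.getAsBoolean() && b.getAsBoolean() && c.getAsBoolean();
        }
    }

    // Returns the numeric version component at 'index', or 0 if it is
    // missing or not a plain number.
    static int versionPart(String osVersion, int index) {
        String[] parts = osVersion.split("\\.");
        if (index >= parts.length) {
            return 0;
        }
        try {
            return Integer.parseInt(parts[index]);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // On AIX/ppc require at least version 7.2; everything else is accepted
    // (assumption: the Linux/ppc threshold is not stated, so it is not checked).
    static boolean supportedOs(String osName, String osArch, String osVersion) {
        if (osArch.startsWith("ppc") && osName.equals("AIX")) {
            int major = versionPart(osVersion, 0);
            int minor = versionPart(osVersion, 1);
            return major > 7 || (major == 7 && minor >= 2);
        }
        return true;
    }

    public static void main(String[] args) {
        BooleanSupplier os = () -> supportedOs(System.getProperty("os.name"),
                System.getProperty("os.arch"), System.getProperty("os.version"));
        System.out.println(new AndPredicate(os, () -> true).getAsBoolean());
    }
}
```
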

From dmitrij.pochepko at oracle.com  Thu Sep  8 14:48:00 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 8 Sep 2016 17:48:00 +0300
Subject: RFR: 8155219 - [TESTBUG] Rewrite compiler/ciReplay/TestVM.sh in java
Message-ID: <a764d97b-7015-d87c-a7c8-3031e1f2762e@oracle.com>

Hi,

please review fix for 8155219 - [TESTBUG] Rewrite 
compiler/ciReplay/TestVM.sh in java

compiler/ciReplay/* tests were ported from shell to java.


CR: https://bugs.openjdk.java.net/browse/JDK-8155219

webrev for root level: 
http://cr.openjdk.java.net/~dpochepk/8155219/webrev.root.01/

webrev for hotspot: http://cr.openjdk.java.net/~dpochepk/8155219/webrev.01/


I've tested it via rbt.

Thanks,

Dmitrij


From vladimir.x.ivanov at oracle.com  Thu Sep  8 15:53:11 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 8 Sep 2016 18:53:11 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
Message-ID: <ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>


>     Profiling info isn't used at all. At the beginning all calls with
>     known targets are already static calls (CallStaticJavaNode in the
>     IR). And during analysis only static info (CHA) is used to
>     devirtualize calls.
>
> Thanks.  Does that mean that marking classes/methods final helps here
> even if at runtime there're no subclasses? I know marking
> classes/methods final removes the need to register guards (by virtue of
> making the call static when receiver is final), but does it have added
> benefit for EA purposes here as well?

Marking classes/methods final can only reduce the number of dependencies 
associated with a method, produced by CHA, but it doesn't give any new 
information to the analysis itself, so it shouldn't affect inlining decisions.

> I'm slightly confused by your "only static info (CHA) is used to
> devirtualize calls" statement.  Are you referring to the same CHA
> concept where loaded class hierarchy is inspected? It sounds like you're
> not since you mention "static info", but CHA is dynamic in my mind.  I'm
> probably misinterpreting this.

Yes, sorry for the confusion. That's the same concept which is used 
during ordinary inlining: class hierarchy inspection and nmethod 
dependencies to track changes.
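
A minimal sketch of the situation, with made-up class names. As long as
Square is the only loaded concrete subclass of Shape, class hierarchy
inspection lets the compiler treat s.area() as a call to Square.area() and
record a dependency on that fact. Loading Circle later breaks the assumption,
and code compiled under it has to be thrown away. Declaring Square final
makes the dependency unnecessary, but the target the compiler arrives at is
the same either way:

```java
// Hypothetical hierarchy for illustration only.
class ChaSketch {
    static abstract class Shape {
        abstract int area();
    }

    static class Square extends Shape {
        final int side;
        Square(int side) { this.side = side; }
        @Override
        int area() { return side * side; }
    }

    // Not loaded until it is first used.
    static class Circle extends Shape {
        final int radius;
        Circle(int radius) { this.radius = radius; }
        @Override
        int area() { return 3 * radius * radius; } // crude integer approximation
    }

    static int totalArea(Shape[] shapes) {
        int total = 0;
        for (Shape s : shapes) {
            total += s.area(); // virtual call in the bytecode
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] squares = { new Square(2), new Square(3) };
        int total = 0;
        // Warm up while Square is the only loaded subclass.
        for (int i = 0; i < 20_000; i++) {
            total += totalArea(squares);
        }
        // First use of Circle loads it and changes the class hierarchy.
        total += totalArea(new Shape[] { new Circle(1) });
        System.out.println(total);
    }
}
```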

>     It's not that simple, since the flags use dumping logic not
>     available in product binaries (e.g., Node::dump() to print
>     corresponing IR nodes).
>
>     I don't see a compelling reason not to have all the dumping logic
>     available in product binaries, but it's much larger project,
>     comparing to changing type for a couple of flags from "nonproduct"
>     to "diagnostic".
>
> Ok, understood.  I do think this would be very valuable, so if you guys
> can make it happen it'd be greatly appreciated.

Filed JDK-8165716 [1].

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8165716

From rednaxelafx at gmail.com  Thu Sep  8 16:14:49 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Thu, 8 Sep 2016 09:14:49 -0700
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
Message-ID: <CA+cQ+tSbTfHreST3ouVcvvi8Np5YeQh4CG5toMmV1Fx-2a60_g@mail.gmail.com>

Hi Vitaly,

On Thu, Sep 8, 2016 at 6:54 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
> I'm slightly confused by your "only static info (CHA) is used to
> devirtualize calls" statement.  Are you referring to the same CHA concept
> where loaded class hierarchy is inspected? It sounds like you're not since
> you mention "static info", but CHA is dynamic in my mind.  I'm probably
> misinterpreting this.
>

One general rule of thumb: when you see JIT people talking about "static
info" (e.g. statically resolvable target), that means values that are known
at JIT-compile time. Or simply, at compile time. From the JIT compilers'
point of view, CHA information is considered "static" with dependencies.

Speaking of "statically known", I'd like to make a side note about static
finals. Assuming we trust static finals will not be changed after first
assignment (excluding outliers like System.in, System.out, System.err),
then there's an interesting difference between what a JIT compiler considers
a "static constant" and what javac does.
In Java, "final" is really overloaded with two different meanings on the
language level: "const" and "readonly" (using C#'s terminology) -- "const"
for javac-level compile-time constants, and "readonly" for values that are
initialized at runtime, but stay immutable after initialization.
javac implements the Java Language Spec, and only treats "const" usage as
constants. On the other hand, to a JIT compiler, both "const" and
"readonly" usages would be considered as static constants, because the
value is known at JIT-compile time and won't change afterwards.
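
A small sketch of the two flavors (class and field names are made up).
CONST_SIZE is a constant expression, so javac folds it at use sites and
allows it as a case label. READONLY_SIZE is computed by the static
initializer, so javac emits a field read for it and rejects it as a case
label, while a JIT compiler that runs after class initialization can still
treat its value as a constant:

```java
// Hypothetical example of "const" vs. "readonly" static finals.
class StaticFinalSketch {
    // "const": a compile-time constant expression.
    static final int CONST_SIZE = 150;

    // "readonly": a method call is not a constant expression, so the value
    // is only known once the class has been initialized.
    static final int READONLY_SIZE = Integer.parseInt("150");

    static int constTwice() {
        return CONST_SIZE * 2; // javac folds this to the constant 300
    }

    static int readonlyTwice() {
        return READONLY_SIZE * 2; // javac emits a getstatic and a multiply
    }

    static String describe(int v) {
        switch (v) {
            case CONST_SIZE: // legal: case labels require constant expressions
                return "const";
            // "case READONLY_SIZE:" would be rejected by javac.
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(constTwice() + " " + readonlyTwice() + " " + describe(150));
    }
}
```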

Just my two cents ;-)

- Kris

From vitalyd at gmail.com  Thu Sep  8 16:27:55 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 8 Sep 2016 12:27:55 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CA+cQ+tSbTfHreST3ouVcvvi8Np5YeQh4CG5toMmV1Fx-2a60_g@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
	<CA+cQ+tSbTfHreST3ouVcvvi8Np5YeQh4CG5toMmV1Fx-2a60_g@mail.gmail.com>
Message-ID: <CAHjP37GH2LRCFNCsnCBZoYXq_VDGFhChHCg0CH5OYLLKXAgZ9A@mail.gmail.com>

Hi Kris,

On Thu, Sep 8, 2016 at 12:14 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:

> Hi Vitaly,
>
> On Thu, Sep 8, 2016 at 6:54 AM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>>
>> I'm slightly confused by your "only static info (CHA) is used to
>> devirtualize calls" statement.  Are you referring to the same CHA concept
>> where loaded class hierarchy is inspected? It sounds like you're not since
>> you mention "static info", but CHA is dynamic in my mind.  I'm probably
>> misinterpreting this.
>>
>
> One general rule of thumb: when you see JIT people talking about "static
> info" (e.g. statically resolvable target), that means values that are known
> at JIT-compile time. Or simply, at compile time. From the JIT compilers'
> point of view, CHA information is considered "static" with dependencies.
>
Right.  I just wanted to make sure that was the case here.  "JIT-time
static" would prevent confusion :).

>
> Speaking of "statically known", I'd like to make a side note about static
> finals. Assuming we trust static finals will not be changed after first
> assignment (excluding outliers like System.in, System.out, System.err),
> then there's an interesting difference between what a JIT compiler considers
> a "static constant" and what javac does.
> In Java, "final" is really overloaded with two different meanings on the
> language level: "const" and "readonly" (using C#'s terminology) -- "const"
> for javac-level compile-time constants, and "readonly" for values that are
> initialized at runtime, but stay immutable after initialization.
> javac implements the Java Language Spec, and only treats "const" usage as
> constants. On the other hand, to a JIT compiler, both "const" and
> "readonly" usages would be considered as static constants, because the
> value is known at JIT-compile time and won't change afterwards.
>
Yes.  I make use of that quite a bit to make javac-time dynamic expressions
be JIT-time constants.  However, I hope the whole trusting final instance
fields stuff happens soon.  Otherwise, seemingly const-foldable code like
this:

static long makeMask() {
    return (1L << SomeEnum.A.ordinal()) | (1L << SomeEnum.B.ordinal()); // etc
}

isn't folded when makeMask is compiled.  But make that mask a static final:

static final long MASK = // same expression as above

and we're good.
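
Spelled out as a self-contained sketch (the class name MaskDemo, the enum
constants and the inMask helper are placeholders for illustration, not code
from a real project):

```java
public class MaskDemo {
    enum SomeEnum { A, B, C }

    // Evaluated on every call; as described above, this is not folded to a
    // constant when makeMask itself is compiled.
    static long makeMask() {
        return (1L << SomeEnum.A.ordinal()) | (1L << SomeEnum.B.ordinal());
    }

    // Evaluated once during class initialization. A JIT that trusts static
    // finals can treat the loaded value as a constant afterwards.
    static final long MASK = makeMask();

    static boolean inMask(SomeEnum e) {
        return (MASK & (1L << e.ordinal())) != 0;
    }

    public static void main(String[] args) {
        System.out.println(MASK);
        System.out.println(inMask(SomeEnum.A) + " " + inMask(SomeEnum.C));
    }
}
```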


> Just my two cents ;-)
>
> - Kris
>

Anyway, we're going off on a tangent here, but thanks for the thoughts
Kris.

From vitalyd at gmail.com  Thu Sep  8 16:32:04 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 8 Sep 2016 12:32:04 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
	<ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>
Message-ID: <CAHjP37F1b5oFJEqaq8=Kp0TF-GPy_DJ_vhDJH2yq5DOuan=6sg@mail.gmail.com>

On Thu, Sep 8, 2016 at 11:53 AM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

>
>     Profiling info isn't used at all. At the beginning all calls with
>>     known targets are already static calls (CallStaticJavaNode in the
>>     IR). And during analysis only static info (CHA) is used to
>>     devirtualize calls.
>>
>> Thanks.  Does that mean that marking classes/methods final helps here
>> even if at runtime there're no subclasses? I know marking
>> classes/methods final removes the need to register guards (by virtue of
>> making the call static when receiver is final), but does it have added
>> benefit for EA purposes here as well?
>>
>
> Marking classes/methods final can only reduce the number of dependencies
> associated with a method, produced by CHA, but it doesn't give any new
> information to the analysis itself, so shouldn't affect inlining decisions.

Right.  We're on the same page now with respect to "static info (CHA)" :).

By the way, and this is off-topic to this thread (apologies), but while
we're discussing marking classes/methods final, are there any other
footprint advantages to doing it even if CHA will devirt calls properly? So
removing the need to register dependencies is one, and is good.  Are the
vtables smaller for these cases? Anything else that's an added benefit
(from JVM runtime standpoint)?

>
>
> I'm slightly confused by your "only static info (CHA) is used to
>> devirtualize calls" statement.  Are you referring to the same CHA
>> concept where loaded class hierarchy is inspected? It sounds like you're
>> not since you mention "static info", but CHA is dynamic in my mind.  I'm
>> probably misinterpreting this.
>>
>
> Yes, sorry for the confusion. That's the same concept which is used during
> ordinary inlining: class hierarchy inspection and nmethod dependencies to
> track changes.

Yup, all good - thanks.

>
>
>     It's not that simple, since the flags use dumping logic not
>>     available in product binaries (e.g., Node::dump() to print
>>     corresponding IR nodes).
>>
>>     I don't see a compelling reason not to have all the dumping logic
>>     available in product binaries, but it's a much larger project
>>     compared to changing the type of a couple of flags from "nonproduct"
>>     to "diagnostic".
>>
>> Ok, understood.  I do think this would be very valuable, so if you guys
>> can make it happen it'd be greatly appreciated.
>>
>
> Filed JDK-8165716 [1].
>
Thank you!

>
> Best regards,
> Vladimir Ivanov
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8165716
>

From rednaxelafx at gmail.com  Thu Sep  8 21:48:11 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Thu, 8 Sep 2016 14:48:11 -0700
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37F1b5oFJEqaq8=Kp0TF-GPy_DJ_vhDJH2yq5DOuan=6sg@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
	<ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>
	<CAHjP37F1b5oFJEqaq8=Kp0TF-GPy_DJ_vhDJH2yq5DOuan=6sg@mail.gmail.com>
Message-ID: <CA+cQ+tQPthnKPe4SucEpyA5izaR77M284ecoDwTCXcZ14pA-8g@mail.gmail.com>

On Thu, Sep 8, 2016 at 9:32 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:

>
> By the way, and this is off-topic to this thread (apologies), but while
> we're discussing marking classes/methods final, are there any other
> footprint advantages to doing it even if CHA will devirt calls properly? So
> removing the need to register dependencies is one, and is good.  Are the
> vtables smaller for these cases? Anything else that's an added benefit
> (from JVM runtime standpoint)?
>

 Well...nothing that really stands out.

Removing the need for registering the dependencies is certainly a good
thing, but it doesn't really matter that much.

The vtable won't necessarily be smaller, it depends. What's guaranteed
is that a final method won't need a *new* vtable entry.
Because "final" can be labeled on a method that's virtual in some base
class, and is only "final" on some derived class. That vtable slot in the
derived class is going to be inherited from the base class and then set to
the overriding target, so no saving at all in this case.

bool klassVtable::needs_new_vtable_entry(methodHandle target_method,
                                         Klass* super,
                                         Handle classloader,
                                         Symbol* classname,
                                         AccessFlags class_flags,
                                         TRAPS) {
  // ...

  if (target_method->is_final_method(class_flags) ||
      // a final method never needs a new entry; final methods can be statically
      // resolved and they have to be present in the vtable only if they override
      // a super's method, in which case they re-use its entry
      (target_method()->is_static()) ||
      // static methods don't need to be in vtable
      (target_method()->name() ==  vmSymbols::object_initializer_name())
      // <init> is never called dynamically-bound
      ) {
    return false;
  }

  // ...
}
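
On the Java side, the two cases look like this. This is only a sketch of the
source-level shapes involved; the class and method names are invented, and
the comments about slots restate the check above rather than anything the
program can observe.

```java
public class FinalSlots {
    static class Base {
        // Virtual in the base class, so Base has a vtable slot for it.
        int id() { return 1; }
    }

    static class Derived extends Base {
        // Final override of a virtual method: per the check above it re-uses
        // the slot inherited from Base, so nothing is saved here.
        @Override
        final int id() { return 2; }

        // Final method declared for the first time in this class: this is
        // the case where no new vtable entry is needed.
        final int extra() { return 3; }
    }

    // The call site only knows the static type Base, yet still reaches the
    // final override in Derived.
    static int callThroughBase(Base b) {
        return b.id();
    }

    public static void main(String[] args) {
        System.out.println(callThroughBase(new Base()));
        System.out.println(callThroughBase(new Derived()));
        System.out.println(new Derived().extra());
    }
}
```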

The only thing that I can think of that improves *interpreter* performance
is the invoke_vfinal HotSpot internal bytecode. It allows the interpreter
in HotSpot to skip the vtable lookup and directly dispatch to the target
method, even when the original Java bytecode was invokevirtual. But it's
only an optimization for the interpreter, and it doesn't matter for the JIT
compilers.

- Kris

From john.r.rose at oracle.com  Thu Sep  8 22:13:12 2016
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 8 Sep 2016 15:13:12 -0700
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CA+cQ+tQPthnKPe4SucEpyA5izaR77M284ecoDwTCXcZ14pA-8g@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
	<ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>
	<CAHjP37F1b5oFJEqaq8=Kp0TF-GPy_DJ_vhDJH2yq5DOuan=6sg@mail.gmail.com>
	<CA+cQ+tQPthnKPe4SucEpyA5izaR77M284ecoDwTCXcZ14pA-8g@mail.gmail.com>
Message-ID: <80D3ACAD-7751-4DF9-AFEE-B3AFEC420210@oracle.com>

On Sep 8, 2016, at 2:48 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
> 
> On Thu, Sep 8, 2016 at 9:32 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> By the way, and this is off-topic to this thread (apologies), but while we're discussing marking classes/methods final, are there any other footprint advantages to doing it even if CHA will devirt calls properly? So removing the need to register dependencies is one, and is good.  Are the vtables smaller for these cases? Anything else that's an added benefit (from JVM runtime standpoint)? 
> 
>  Well...nothing that really stands out.
> 
> Removing the need for registering the dependencies is certainly a good thing, but it doesn't really matter that much.
> 
> The vtable won't necessarily be smaller, it depends. What's guaranteed is that a final method won't need a *new* vtable entry.
> Because "final" can be labeled on a method that's virtual in some base class, and is only "final" on some derived class. That vtable slot in the derived class is going to be inherited from the base class and then set to the overriding target, so no saving at all in this case.

HotSpot is overly generous with v-table entries.  IIRC even privates get their own entries, for convoluted reasons.

- John

From vitalyd at gmail.com  Thu Sep  8 22:13:31 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 8 Sep 2016 18:13:31 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CA+cQ+tQPthnKPe4SucEpyA5izaR77M284ecoDwTCXcZ14pA-8g@mail.gmail.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
	<ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>
	<CAHjP37F1b5oFJEqaq8=Kp0TF-GPy_DJ_vhDJH2yq5DOuan=6sg@mail.gmail.com>
	<CA+cQ+tQPthnKPe4SucEpyA5izaR77M284ecoDwTCXcZ14pA-8g@mail.gmail.com>
Message-ID: <CAHjP37EOcMDafeErOMnXsAV_p8gGiB6Fy7i=CqtepEWWAPb-ew@mail.gmail.com>

On Thursday, September 8, 2016, Krystal Mok <rednaxelafx at gmail.com> wrote:

> On Thu, Sep 8, 2016 at 9:32 AM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>>
>> By the way, and this is off-topic to this thread (apologies), but while
>> we're discussing marking classes/methods final, are there any other
>> footprint advantages to doing it even if CHA will devirt calls properly? So
>> removing the need to register dependencies is one, and is good.  Are the
>> vtables smaller for these cases? Anything else that's an added benefit
>> (from JVM runtime standpoint)?
>>
>
>  Well...nothing that really stands out.
>
> Removing the need for registering the dependencies is certainly a good
> thing, but it doesn't really matter that much.
>
I'll take it :).  I'm assuming you think it doesn't really matter because
it's only done for C2 compiled code (or C1 as well?), and that's not an
excessive number and this is only checked at class loading time which also
shouldn't happen much (if at all) once steady state is reached.  Or is
there something else/more to your reasoning?

>
> The vtable won't necessarily be smaller, it depends. What's guaranteed
> is that a final method won't need a *new* vtable entry.
>
Yes, I meant for classes that declare the method, not inherit it.

> Because "final" can be labeled on a method that's virtual in some base
> class, and is only "final" on some derived class. That vtable slot in the
> derived class is going to be inherited from the base class and then set to
> the overriding target, so no saving at all in this case.
>
> bool klassVtable::needs_new_vtable_entry(methodHandle target_method,
>                                          Klass* super,
>                                          Handle classloader,
>                                          Symbol* classname,
>                                          AccessFlags class_flags,
>                                          TRAPS) {
>   // ...
>
>   if (target_method->is_final_method(class_flags) ||
>       // a final method never needs a new entry; final methods can be statically
>       // resolved and they have to be present in the vtable only if they override
>       // a super's method, in which case they re-use its entry
>       (target_method()->is_static()) ||
>       // static methods don't need to be in vtable
>       (target_method()->name() ==  vmSymbols::object_initializer_name())
>       // <init> is never called dynamically-bound
>       ) {
>     return false;
>   }
>
>   // ...
> }
>
Thanks for the code pointer.

>
> The only thing that I can think of that improves *interpreter* performance
> is the invoke_vfinal HotSpot internal bytecode. It allows the interpreter
> in HotSpot to skip the vtable lookup and directly dispatch to the target
> method, even when the original Java bytecode was invokevirtual. But it's
> only an optimization for the interpreter, and it doesn't matter for the JIT
> compilers.
>
Yeah, don't care too much about interpreter :)

>
> - Kris
>
>
>

-- 
Sent from my phone

From rednaxelafx at gmail.com  Thu Sep  8 23:28:29 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Thu, 8 Sep 2016 16:28:29 -0700
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <80D3ACAD-7751-4DF9-AFEE-B3AFEC420210@oracle.com>
References: <CAHjP37Fjf878iKxCi-ncuqd3CbGdk8wqfbN+162Edph8ajBPCg@mail.gmail.com>
	<9a143950-c069-7cae-0c52-d16aae4fe8fe@oracle.com>
	<CAHjP37FE5qJQaT=M7dywGMWS_c35grvhANRzHpqb67D-KT1Wgw@mail.gmail.com>
	<156e3c2e-238b-bbc1-08c6-358837839f5c@oracle.com>
	<CAHjP37GxUnaK+2QFmm2zt0CxbMkaZ+AuwZRudLp7VxzmZQFjRg@mail.gmail.com>
	<ec5068d4-4fc7-3b63-2089-e48cbcca9eb3@oracle.com>
	<CAHjP37F1b5oFJEqaq8=Kp0TF-GPy_DJ_vhDJH2yq5DOuan=6sg@mail.gmail.com>
	<CA+cQ+tQPthnKPe4SucEpyA5izaR77M284ecoDwTCXcZ14pA-8g@mail.gmail.com>
	<80D3ACAD-7751-4DF9-AFEE-B3AFEC420210@oracle.com>
Message-ID: <CA+cQ+tSDLvKgLSq5ZuBnfSfwOGXs-Q=GQMfJDJ5XhY5o55tQ2Q@mail.gmail.com>

On Thu, Sep 8, 2016 at 3:13 PM, John Rose <john.r.rose at oracle.com> wrote:

> On Sep 8, 2016, at 2:48 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>
>
> On Thu, Sep 8, 2016 at 9:32 AM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>
>>
>> By the way, and this is off-topic to this thread (apologies), but while
>> we're discussing marking classes/methods final, are there any other
>> footprint advantages to doing it even if CHA will devirt calls properly? So
>> removing the need to register dependencies is one, and is good.  Are the
>> vtables smaller for these cases? Anything else that's an added benefit
>> (from JVM runtime standpoint)?
>>
>
>  Well...nothing that really stands out.
>
> Removing the need for registering the dependencies is certainly a good
> thing, but it doesn't really matter that much.
>
> The vtable won't necessarily be smaller, it depends. What's guaranteed
> is that a final method won't need a *new* vtable entry.
> Because "final" can be labeled on a method that's virtual in some base
> class, and is only "final" on some derived class. That vtable slot in the
> derived class is going to be inherited from the base class and then set to
> the overriding target, so no saving at all in this case.
>
>
> HotSpot is overly generous with v-table entries.  IIRC even privates get
> their own entries, for convoluted reasons.
>
Thanks for the tip, John! You're right. Yikes...

  // private methods in classes always have a new entry in the vtable
  // specification interpretation since classic has
  // private methods not overriding
  // JDK8 adds private  methods in interfaces which require invokespecial
  if (target_method()->is_private()) {
    return true;
  }
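
The "private methods not overriding" part of that comment is the language
rule shown in the sketch below. The names are invented for illustration, and
the program only demonstrates the non-overriding behaviour, not the vtable
layout itself.

```java
public class PrivateNoOverride {
    static class Base {
        private String name() { return "Base"; }

        // Always binds to Base.name(), because private methods are not
        // inherited and therefore cannot be overridden.
        String describe() { return name(); }
    }

    static class Derived extends Base {
        // Same name and signature, but a separate, unrelated method.
        private String name() { return "Derived"; }

        String describeSelf() { return name(); }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.describe());
        System.out.println(d.describeSelf());
    }
}
```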

- Kris


> - John
>

From kishor.kharbas at intel.com  Fri Sep  9 00:46:49 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Fri, 9 Sep 2016 00:46:49 +0000
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F38EB957C3@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
	<4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
	<7de8489c-943b-5ecf-48c1-0bffad101070@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB957C3@ORSMSX116.amr.corp.intel.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F38EB96FE4@ORSMSX116.amr.corp.intel.com>

Hi Vladimir,
I couldn't reproduce the error on my 32-bit Linux machine. The test was run on a Sandy Bridge machine (which has the AVX instruction set).
Please advise how to proceed further.

Thanks
Kishor


-----Original Message-----
From: Kharbas, Kishor 
Sent: Tuesday, September 6, 2016 5:40 PM
To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows

Hi Vladimir,

The patch only touches code under _WIN64. I am having a hard time understanding why the test fails on 32-bit Linux.

Btw, that test passes on Windows 64 platform. I am planning to test on Linux too.

Thanks
Kishor

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Tuesday, September 6, 2016 2:31 PM
To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows

Next jtreg test failed on 32-bit Linux:

hotspot/test/compiler/runtime/Test7196199.java

----------System.err:(57/2416)----------
test_incrc: [41] = 8.081506E20 != 150000.0
test_incrc: [42] = 1.8632992E31 != 150000.0
test_incrc: [43] = 2.8397877E29 != 150000.0 ...

https://bugs.openjdk.java.net/browse/JDK-7196199

was related to the upper bits (64-255) of XMM (YMM) registers not being saved/restored in the interrupt handler code during a safepoint.

Looks like your changes are not enough.

Vladimir


On 9/6/16 10:12 AM, Vladimir Kozlov wrote:
> Good. I start testing these changes. I will push it if testing pass.
>
> Thanks,
> Vladimir
>
> On 9/2/16 3:07 PM, Kharbas, Kishor wrote:
>> Thanks Vladimir,
>>
>> I have updated the patch :
>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.02/
>>
>> I looked for other places in src/cpu/x86/vm. I feel every case is 
>> covered.
>>
>> - Kishor
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Thursday, September 1, 2016 11:39 AM
>> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
>> hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get 
>> clobbered by a JNI call on windows
>>
>> Good. But looks like some code relied on old stack layout in stubs, 
>> for example sha256_AVX2():
>>
>> #ifndef _WIN64
>>    _XMM_SAVE_SIZE = 0,
>> #else
>>    _XMM_SAVE_SIZE = 8*16,
>> #endif
>>
>> Please, check that all other related code is fixed too. (I looked on 
>> all cases of _WIN64 in src/cpu/x86/vm/).
>>
>> Thanks,
>> Vladimir
>>
>> On 8/31/16 10:17 PM, Kharbas, Kishor wrote:
>>> Hello,
>>>
>>> I removed the unwanted save and restore of registers in the range
>>> XMM6-XMM31 from the x86_64 stubs.
>>> I also removed the #ifdef _WIN64 block from x86.ad file.
>>>
>>> Link to the new patch :
>>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/
>>>
>>> Thanks
>>> Kishor
>>>
>>>
>>> -----Original Message-----
>>> From: Kharbas, Kishor
>>> Sent: Wednesday, August 24, 2016 6:24 PM
>>> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; 
>>> hotspot-compiler-dev at openjdk.java.net
>>> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
>>> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get 
>>> clobbered by a JNI call on windows
>>>
>>> Thanks Vladimir for quick feedback.
>>> I will look into the stubs which save the registers in the range 
>>> XMM6-XMM31. Also the first comment makes perfect sense.
>>>
>>> Thanks
>>> Kishor
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Wednesday, August 24, 2016 3:08 PM
>>> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
>>> hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get 
>>> clobbered by a JNI call on windows
>>>
>>> Hi Kishor,
>>>
>>> First, #ifdef _WIN64 is not needed anymore since the calling convention 
>>> is similar to unix now.
>>>
>>> Second, I would like you to look more broadly. With this change we 
>>> don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not 
>>> sure that we can remove all #ifdef _WIN64 there but for most of them 
>>> I think we can do. Please, look.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
>>>> Requesting the community to review the patch for
>>>> https://bugs.openjdk.java.net/browse/JDK-8078122
>>>>
>>>> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>>>>
>>>> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>>>>
>>>> Thank you.
>>>>
>>>> Kishor
>>>>

From cnewland at chrisnewland.com  Fri Sep  9 06:48:05 2016
From: cnewland at chrisnewland.com (Chris Newland)
Date: Fri, 9 Sep 2016 07:48:05 +0100
Subject: Clarification on hot_throw optimisation
Message-ID: <4a2a7e1c8db461198aa896cd24be45b1.squirrel@excalibur.xssl.net>

Hi all,

I'm adding support for highlighting the hot_throw HotSpot optimisation in
JITWatch (a LogCompilation visualiser) [1] and would like to ask if I've
understood it correctly please.

Example code:
https://github.com/AdoptOpenJDK/jitwatch/blob/master/core/src/main/resources/examples/HotThrow.java

Issue where I've tried to note my findings:
https://github.com/AdoptOpenJDK/jitwatch/issues/223

======================================
import java.util.Random;

public class HotThrow
{
    private Random random = new Random();

    public HotThrow()
    {
        StringBuilder builder = new StringBuilder();

        String string = "The quick brown fox jumps over the lazy dog";

        char[] chars = string.toCharArray();

        for (int i = 0 ; i < 1_000_000; i++)
        {
            int index = random.nextInt(100);

            char c = getChar(chars, index);

            builder.append(c);
        }

        System.out.println(builder.toString());
    }

    public char getChar(char[] chars, int index)
    {
        try
        {
            return chars[index];
        }
        catch(ArrayIndexOutOfBoundsException e)
        {
            return '*';
        }
    }

    public static void main(String[] args)
    {
        new HotThrow();
    }
}
======================================

I believe that the range check on the array index was eliminated in C2 but
hit a trap when index was out of range.

HotSpot then detected this as a hot throw in vm/opto/graphKit.cpp

case Deoptimization::Reason_range_check:
      ex_obj = env()->ArrayIndexOutOfBoundsException_instance();
      break;

and because there was a local exception handler it uses a pre-allocated
AIOOBE (without a stack trace?) and didn't deoptimise or drop back to the
interpreter.
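
One way to probe the "without a stack trace?" part from Java code is to count
how many caught exceptions carry an empty stack trace. The sketch below is a
variation on the example above; the class name HotThrowProbe and its counters
are made up, and it assumes the default -XX:+OmitStackTraceInFastThrow
setting. How many empty traces show up, if any, depends on the VM and on when
the method gets compiled.

```java
public class HotThrowProbe {
    static int caught = 0;
    static int emptyTraces = 0;

    static char getChar(char[] chars, int index) {
        try {
            return chars[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            caught++;
            // A preallocated exception thrown from compiled code is expected
            // to have no stack trace filled in.
            if (e.getStackTrace().length == 0) {
                emptyTraces++;
            }
            return '*';
        }
    }

    // Runs the loop with a deterministic index pattern and returns the
    // number of exceptions caught.
    static int run(int iterations) {
        char[] chars = "The quick brown fox jumps over the lazy dog".toCharArray();
        caught = 0;
        emptyTraces = 0;
        for (int i = 0; i < iterations; i++) {
            getChar(chars, i % 100);
        }
        return caught;
    }

    public static void main(String[] args) {
        int total = run(1_000_000);
        System.out.println("caught=" + total + " emptyTraces=" + emptyTraces);
    }
}
```

If emptyTraces stays at zero, running again with more iterations or checking
the LogCompilation output for hot_throw would be the next thing to try.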

JITWatch looks for LogCompilation like:

    <parse method="832" stamp="0.187" uses="10000">
      <observe total="-1" count="-1" trap="range_check"/>
      <observe that="has_exception_handlers"/>
      <bc code="52" bci="2"/>
      <uncommon_trap reason="null_check" bci="2" action="maybe_recompile"/>
      <observe count="-1" trap="range_check"/>
      <hot_throw reason="range_check" preallocated="1"/>
      <parse_done nodes="75" memory="31744" stamp="0.187" live="72"/>
    </parse>

and I then use bci reference and the method bytecode Exception table to
look up the exception type and highlight it in the JITWatch UI:
https://www.chrisnewland.com/images/jitwatch/release1.1/hotthrow.png

Is this correct?

I didn't quite understand the comment in graphKit

  // Note:   If the deopt count has blown up, the uncommon trap
  // runtime is going to flush this nmethod, not matter what.

Will the hot_throw optimisation stop working after a certain count? I've
not observed that yet.

Many thanks,

Chris
@chriswhocodes

[1] https://github.com/AdoptOpenJDK/jitwatch


From adinn at redhat.com  Fri Sep  9 08:34:52 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 9 Sep 2016 09:34:52 +0100
Subject: [aarch64-port-dev ] RFR: AArch64: Fix JNI floating point argument
	handling
In-Reply-To: <CACO5V=YMYK37qSm+ot1s=tuOzXWn5rhUU3KztxQcX==NPMbBzA@mail.gmail.com>
References: <CACO5V=Z6i0tqCrEOjSeOxhq9vBePOCNu9E0fsk17+Mgt2Ushyw@mail.gmail.com>
	<1d9d7d75-a20e-4145-dfe6-e8ff8e3aea7c@redhat.com>
	<CACO5V=YMYK37qSm+ot1s=tuOzXWn5rhUU3KztxQcX==NPMbBzA@mail.gmail.com>
Message-ID: <8f6113af-fd97-08c1-776d-f37b289be060@redhat.com>

Hi Ningsheng,

On 08/09/16 10:20, Ningsheng Jian wrote:
> I have updated the webrev at:
> 
> http://people.linaro.org/~ningsheng.jian/jni-fix/webrev.01/
> 
> Please help to review it. It passed jtreg tests on my arm server with
> fastdebug build.

Andrew Haley is on holiday at the moment so I have reviewed this patch.
It looks fine and the test passes on my patched build.

This will need a sponsor from Oracle to get it supplied with the
necessary exemption for jdk9 and committed -- any takers from the [in
cc] compiler dev list?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From doug.simon at oracle.com  Fri Sep  9 09:01:39 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 9 Sep 2016 11:01:39 +0200
Subject: RFR: 8165755: [JVMCI] replace use of vm_abort with vm_exit
Message-ID: <799CA13D-6BDC-4BF5-9241-515A684191F4@oracle.com>

Calling vm_abort from multiple threads can cause nasty crashes such as double free errors. We've seen this in Graal during JVMCI initialization when an unknown Graal option is encountered. Multiple compiler threads try to initialize JVMCI which fails with an exception indicating the bad option:

Uncaught exception at /scratch/graaluser/buildslave/buildlog/ci_executor/main/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127 
java.lang.ExceptionInInitializerError 
        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime(HotSpotJVMCIRuntime.java:85) 
        at jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method) 
        at jdk.vm.ci.runtime.JVMCI.<clinit>(JVMCI.java:58) 
Caused by: java.lang.IllegalArgumentException: Could not find option OptSomethingThatDoesNotExcist 
        at com.oracle.graal.options.OptionsParser.parseOption(OptionsParser.java:134) 
        at com.oracle.graal.options.OptionsParser.parseOptions(OptionsParser.java:62) 
        at com.oracle.graal.hotspot.HotSpotGraalCompilerFactory.initializeOptions(HotSpotGraalCompilerFactory.java:156) 
        at com.oracle.graal.hotspot.HotSpotGraalCompilerFactory.onSelection(HotSpotGraalCompilerFactory.java:86) 
        at jdk.vm.ci.hotspot.HotSpotJVMCICompilerConfig.getCompilerFactory(HotSpotJVMCICompilerConfig.java:96) 
        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.<init>(HotSpotJVMCIRuntime.java:277) 
        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.<init>(HotSpotJVMCIRuntime.java:67) 
        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime$DelayedInit.<clinit>(HotSpotJVMCIRuntime.java:75) 
        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime(HotSpotJVMCIRuntime.java:85) 
        at jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method) 
        at jdk.vm.ci.runtime.JVMCI.<clinit>(JVMCI.java:58) 

The native JVMCI code then tries to exit the VM by calling vm_abort. If multiple compiler threads do this concurrently, certain destructors can be called twice as shown by these thread dumps:

thread #26: tid = 0x0019, 0x00007fff84280124 libsystem_malloc.dylib`szone_size + 227, stop reason = signal SIGSTOP
  frame #0: 0x00007fff84280124 libsystem_malloc.dylib`szone_size + 227
  frame #1: 0x00007fff8427fed5 libsystem_malloc.dylib`free + 61
  frame #2: 0x000000010ac95963 libjvm.dylib`os::free(memblock=0x00007fedc86226e0, memflags=mtInternal) + 307 at os.cpp:711
  frame #3: 0x000000010a2afc54 libjvm.dylib`FreeHeap(p=0x00007fedc86226e0, memflags=mtInternal) + 52 at allocation.inline.hpp:93
  frame #4: 0x000000010acf0a9f libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8622650) + 63 at perfData.cpp:116
  frame #5: 0x000000010acf0ae5 libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8622650) + 21 at perfData.cpp:114
  frame #6: 0x000000010acf163d libjvm.dylib`PerfDataManager::destroy() + 109 at perfData.cpp:287
  frame #7: 0x000000010acf3f4d libjvm.dylib`perfMemory_exit() + 61 at perfMemory.cpp:74
  frame #8: 0x000000010ac9bb0d libjvm.dylib`os::shutdown() + 13 at os_bsd.cpp:1130
  frame #9: 0x000000010ac9bb55 libjvm.dylib`os::abort(dump_core=false) + 21 at os_bsd.cpp:1150
  frame #10: 0x000000010a9188e7 libjvm.dylib`vm_abort(dump_core=false) + 39 at java.cpp:666
  frame #11: 0x000000010aa4f1e7 libjvm.dylib`JVMCIRuntime::abort_on_pending_exception(exception=Handle @ 0x000070000175b208, message="Uncaught exception at /Users/dsimon/graal/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127", dump_core=false) + 167 at jvmciRuntime.cpp:992
  frame #12: 0x000000010aa17017 libjvm.dylib`JVMCICompiler::compile_method(this=0x00007fedcb203050, method=0x000070000175b8d8, entry_bci=-1, env=0x000070000175b8f0) + 311 at jvmciCompiler.cpp:127
  frame #13: 0x000000010a656cd2 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x00007fedc853ca30) + 1314 at compileBroker.cpp:2207

  thread #23: tid = 0x0016, 0x00007fff91fcb122 libsystem_kernel.dylib`__semwait_signal_nocancel + 10, stop reason = signal SIGSTOP
  frame #0: 0x00007fff91fcb122 libsystem_kernel.dylib`__semwait_signal_nocancel + 10
  frame #1: 0x00007fff9578c318 libsystem_c.dylib`nanosleep$NOCANCEL + 188
  frame #2: 0x00007fff957b62ce libsystem_c.dylib`usleep$NOCANCEL + 54
  frame #3: 0x00007fff957e46e9 libsystem_c.dylib`abort + 139
  frame #4: 0x00007fff8428c396 libsystem_malloc.dylib`szone_error + 626
  frame #5: 0x000000010ac95963 libjvm.dylib`os::free(memblock=0x00007fedc8601cd0, memflags=mtInternal) + 307 at os.cpp:711
  frame #6: 0x000000010a2afc54 libjvm.dylib`FreeHeap(p=0x00007fedc8601cd0, memflags=mtInternal) + 52 at allocation.inline.hpp:93
  frame #7: 0x000000010acf0a9f libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8601c60) + 63 at perfData.cpp:116
  frame #8: 0x000000010acf0ae5 libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8601c60) + 21 at perfData.cpp:114
  frame #9: 0x000000010acf163d libjvm.dylib`PerfDataManager::destroy() + 109 at perfData.cpp:287
  frame #10: 0x000000010acf3f4d libjvm.dylib`perfMemory_exit() + 61 at perfMemory.cpp:74
  frame #11: 0x000000010ac9bb0d libjvm.dylib`os::shutdown() + 13 at os_bsd.cpp:1130
  frame #12: 0x000000010ac9bb55 libjvm.dylib`os::abort(dump_core=false) + 21 at os_bsd.cpp:1150
  frame #13: 0x000000010a9188e7 libjvm.dylib`vm_abort(dump_core=false) + 39 at java.cpp:666
  frame #14: 0x000000010aa4f1e7 libjvm.dylib`JVMCIRuntime::abort_on_pending_exception(exception=Handle @ 0x0000700001452208, message="Uncaught exception at /Users/dsimon/graal/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127", dump_core=false) + 167 at jvmciRuntime.cpp:992
  frame #15: 0x000000010aa17017 libjvm.dylib`JVMCICompiler::compile_method(this=0x00007fedcb203050, method=0x00007000014528d8, entry_bci=-1, env=0x00007000014528f0) + 311 at jvmciCompiler.cpp:127
  frame #16: 0x000000010a656cd2 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x00007fedc862a320) + 1314 at compileBroker.cpp:2207


This webrev replaces calls to vm_abort() with before_exit() + vm_exit(). The latter is thread safe.
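
The failure mode above is several threads each running the same teardown. As a minimal sketch of the general remedy (run the teardown at most once, however many threads ask for it), here is a Java model. It is not HotSpot code and does not claim to show how vm_exit() is implemented; the class ShutdownOnce and its methods are hypothetical names used only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: many threads request shutdown, only one runs the teardown.
final class ShutdownOnce {
    private final AtomicBoolean started = new AtomicBoolean(false);
    private final AtomicInteger teardownRuns = new AtomicInteger();

    // Returns true only for the single caller that actually performed the teardown.
    boolean shutdown() {
        if (!started.compareAndSet(false, true)) {
            return false; // another thread already owns the shutdown
        }
        teardownRuns.incrementAndGet(); // stands in for freeing shared data exactly once
        return true;
    }

    // Starts the given number of threads, releases them together, and
    // reports how many times the teardown body ran.
    static int raceShutdown(int threads) {
        ShutdownOnce guard = new ShutdownOnce();
        CountDownLatch go = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    go.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                guard.shutdown();
            });
            workers[i].start();
        }
        go.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return guard.teardownRuns.get();
    }

    public static void main(String[] args) {
        System.out.println("teardown runs: " + raceShutdown(8));
    }
}
```

The compare-and-set lets exactly one caller through; the others return without touching the shared state, which is the property the double-free stack traces show to be missing.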

https://bugs.openjdk.java.net/browse/JDK-8165755
http://cr.openjdk.java.net/~dnsimon/8165755/

-Doug

From aph at redhat.com  Fri Sep  9 10:10:23 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 9 Sep 2016 11:10:23 +0100
Subject: [aarch64-port-dev ] RFR: AArch64: Fix JNI floating point argument
	handling
In-Reply-To: <8f6113af-fd97-08c1-776d-f37b289be060@redhat.com>
References: <CACO5V=Z6i0tqCrEOjSeOxhq9vBePOCNu9E0fsk17+Mgt2Ushyw@mail.gmail.com>
	<1d9d7d75-a20e-4145-dfe6-e8ff8e3aea7c@redhat.com>
	<CACO5V=YMYK37qSm+ot1s=tuOzXWn5rhUU3KztxQcX==NPMbBzA@mail.gmail.com>
	<8f6113af-fd97-08c1-776d-f37b289be060@redhat.com>
Message-ID: <0e8ac5bb-0ab7-1120-a1d3-a15bf786c6da@redhat.com>

On 09/09/16 09:34, Andrew Dinn wrote:
> Hi Ningsheng,
> 
> On 08/09/16 10:20, Ningsheng Jian wrote:
>> I have updated the webrev at:
>>
>> http://people.linaro.org/~ningsheng.jian/jni-fix/webrev.01/
>>
>> Please help to review it. It passed jtreg tests on my arm server with
>> fastdebug build.
> 
> Andrew Haley is on holiday at the moment so I have reviewed this patch.
> It looks fine and the test passes on my patched build.

It's good.  I'm surprised (not to say appalled) that we never noticed
this before now.

Andrew.


From adinn at redhat.com  Fri Sep  9 12:59:11 2016
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 9 Sep 2016 13:59:11 +0100
Subject: [aarch64-port-dev ] RFR: AArch64: Fix JNI floating point argument
	handling
In-Reply-To: <0e8ac5bb-0ab7-1120-a1d3-a15bf786c6da@redhat.com>
References: <CACO5V=Z6i0tqCrEOjSeOxhq9vBePOCNu9E0fsk17+Mgt2Ushyw@mail.gmail.com>
	<1d9d7d75-a20e-4145-dfe6-e8ff8e3aea7c@redhat.com>
	<CACO5V=YMYK37qSm+ot1s=tuOzXWn5rhUU3KztxQcX==NPMbBzA@mail.gmail.com>
	<8f6113af-fd97-08c1-776d-f37b289be060@redhat.com>
	<0e8ac5bb-0ab7-1120-a1d3-a15bf786c6da@redhat.com>
Message-ID: <3f7844ac-8bd6-207d-cd81-ae93c8391dcf@redhat.com>

On 09/09/16 11:10, Andrew Haley wrote:
> On 09/09/16 09:34, Andrew Dinn wrote:
>> Hi Ningsheng,
>>
>> On 08/09/16 10:20, Ningsheng Jian wrote:
>>> I have updated the webrev at:
>>>
>>> http://people.linaro.org/~ningsheng.jian/jni-fix/webrev.01/
>>>
>>> Please help to review it. It passed jtreg tests on my arm server with
>>> fastdebug build.
>>
>> Andrew Haley is on holiday at the moment so I have reviewed this patch.
>> It looks fine and the test passes on my patched build.
> 
> It's good.  I'm surprised (not to say appalled) that we never noticed
> this before now.

Yeah, I was going to run hg blame on the jdk8 tree to see who wrote such
obviously broken code -- but I think it was me so I decided not to bother!

Do we still need someone from Oracle to sponsor this and provide it with
an exemption before it can go into JDK9? Or can you commit it?

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander

From aph at redhat.com  Fri Sep  9 13:30:39 2016
From: aph at redhat.com (Andrew Haley)
Date: Fri, 9 Sep 2016 14:30:39 +0100
Subject: [aarch64-port-dev ] RFR: AArch64: Fix JNI floating point argument
	handling
In-Reply-To: <3f7844ac-8bd6-207d-cd81-ae93c8391dcf@redhat.com>
References: <CACO5V=Z6i0tqCrEOjSeOxhq9vBePOCNu9E0fsk17+Mgt2Ushyw@mail.gmail.com>
	<1d9d7d75-a20e-4145-dfe6-e8ff8e3aea7c@redhat.com>
	<CACO5V=YMYK37qSm+ot1s=tuOzXWn5rhUU3KztxQcX==NPMbBzA@mail.gmail.com>
	<8f6113af-fd97-08c1-776d-f37b289be060@redhat.com>
	<0e8ac5bb-0ab7-1120-a1d3-a15bf786c6da@redhat.com>
	<3f7844ac-8bd6-207d-cd81-ae93c8391dcf@redhat.com>
Message-ID: <da622fdf-b8cb-749a-6c7b-ad79debdbc08@redhat.com>

On 09/09/16 13:59, Andrew Dinn wrote:
> Yeah, I was going to run hg blame on the jdk8 tree to see who wrote such
> obviously broken code -- but I think it was me so I decided not to bother!
> 
> Do we still need someone from Oracle to sponsor this and provide it with
> an exemption before it can go into JDK9? Or can you commit it?

This is a serious bug so it must go in.  We need sponsorship for
the test case.  If we don't get that sponsorship we can commit into
the aarch64 dir, but let's try for sponsorship first.

Andrew.


From cthalinger at twitter.com  Fri Sep  9 17:48:28 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Fri, 9 Sep 2016 07:48:28 -1000
Subject: RFR: 8165755: [JVMCI] replace use of vm_abort with vm_exit
In-Reply-To: <799CA13D-6BDC-4BF5-9241-515A684191F4@oracle.com>
References: <799CA13D-6BDC-4BF5-9241-515A684191F4@oracle.com>
Message-ID: <6391B00B-AFBF-410C-A6A1-2ED95B35EBEB@twitter.com>

I think this looks fine but maybe we should ask the runtime folks.

> On Sep 8, 2016, at 11:01 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> Calling vm_abort from multiple threads can cause nasty crashes such as double free errors. We've seen this in Graal during JVMCI initialization when an unknown Graal option is encountered. Multiple compiler threads try to initialize JVMCI which fails with an exception indicating the bad option:
> 
> Uncaught exception at /scratch/graaluser/buildslave/buildlog/ci_executor/main/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127 
> java.lang.ExceptionInInitializerError 
>        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime(HotSpotJVMCIRuntime.java:85) 
>        at jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method) 
>        at jdk.vm.ci.runtime.JVMCI.<clinit>(JVMCI.java:58) 
> Caused by: java.lang.IllegalArgumentException: Could not find option OptSomethingThatDoesNotExcist 
>        at com.oracle.graal.options.OptionsParser.parseOption(OptionsParser.java:134) 
>        at com.oracle.graal.options.OptionsParser.parseOptions(OptionsParser.java:62) 
>        at com.oracle.graal.hotspot.HotSpotGraalCompilerFactory.initializeOptions(HotSpotGraalCompilerFactory.java:156) 
>        at com.oracle.graal.hotspot.HotSpotGraalCompilerFactory.onSelection(HotSpotGraalCompilerFactory.java:86) 
>        at jdk.vm.ci.hotspot.HotSpotJVMCICompilerConfig.getCompilerFactory(HotSpotJVMCICompilerConfig.java:96) 
>        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.<init>(HotSpotJVMCIRuntime.java:277) 
>        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.<init>(HotSpotJVMCIRuntime.java:67) 
>        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime$DelayedInit.<clinit>(HotSpotJVMCIRuntime.java:75) 
>        at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime(HotSpotJVMCIRuntime.java:85) 
>        at jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method) 
>        at jdk.vm.ci.runtime.JVMCI.<clinit>(JVMCI.java:58) 
> 
> The native JVMCI code then tries to exit the VM by calling vm_abort. If multiple compiler threads do this concurrently, certain destructors can be called twice as shown by these thread dumps:
> 
> thread #26: tid = 0x0019, 0x00007fff84280124 libsystem_malloc.dylib`szone_size + 227, stop reason = signal SIGSTOP
>  frame #0: 0x00007fff84280124 libsystem_malloc.dylib`szone_size + 227
>  frame #1: 0x00007fff8427fed5 libsystem_malloc.dylib`free + 61
>  frame #2: 0x000000010ac95963 libjvm.dylib`os::free(memblock=0x00007fedc86226e0, memflags=mtInternal) + 307 at os.cpp:711
>  frame #3: 0x000000010a2afc54 libjvm.dylib`FreeHeap(p=0x00007fedc86226e0, memflags=mtInternal) + 52 at allocation.inline.hpp:93
>  frame #4: 0x000000010acf0a9f libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8622650) + 63 at perfData.cpp:116
>  frame #5: 0x000000010acf0ae5 libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8622650) + 21 at perfData.cpp:114
>  frame #6: 0x000000010acf163d libjvm.dylib`PerfDataManager::destroy() + 109 at perfData.cpp:287
>  frame #7: 0x000000010acf3f4d libjvm.dylib`perfMemory_exit() + 61 at perfMemory.cpp:74
>  frame #8: 0x000000010ac9bb0d libjvm.dylib`os::shutdown() + 13 at os_bsd.cpp:1130
>  frame #9: 0x000000010ac9bb55 libjvm.dylib`os::abort(dump_core=false) + 21 at os_bsd.cpp:1150
>  frame #10: 0x000000010a9188e7 libjvm.dylib`vm_abort(dump_core=false) + 39 at java.cpp:666
>  frame #11: 0x000000010aa4f1e7 libjvm.dylib`JVMCIRuntime::abort_on_pending_exception(exception=Handle @ 0x000070000175b208, message="Uncaught exception at /Users/dsimon/graal/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127", dump_core=false) + 167 at jvmciRuntime.cpp:992
>  frame #12: 0x000000010aa17017 libjvm.dylib`JVMCICompiler::compile_method(this=0x00007fedcb203050, method=0x000070000175b8d8, entry_bci=-1, env=0x000070000175b8f0) + 311 at jvmciCompiler.cpp:127
>  frame #13: 0x000000010a656cd2 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x00007fedc853ca30) + 1314 at compileBroker.cpp:2207
> 
>  thread #23: tid = 0x0016, 0x00007fff91fcb122 libsystem_kernel.dylib`__semwait_signal_nocancel + 10, stop reason = signal SIGSTOP
>  frame #0: 0x00007fff91fcb122 libsystem_kernel.dylib`__semwait_signal_nocancel + 10
>  frame #1: 0x00007fff9578c318 libsystem_c.dylib`nanosleep$NOCANCEL + 188
>  frame #2: 0x00007fff957b62ce libsystem_c.dylib`usleep$NOCANCEL + 54
>  frame #3: 0x00007fff957e46e9 libsystem_c.dylib`abort + 139
>  frame #4: 0x00007fff8428c396 libsystem_malloc.dylib`szone_error + 626
>  frame #5: 0x000000010ac95963 libjvm.dylib`os::free(memblock=0x00007fedc8601cd0, memflags=mtInternal) + 307 at os.cpp:711
>  frame #6: 0x000000010a2afc54 libjvm.dylib`FreeHeap(p=0x00007fedc8601cd0, memflags=mtInternal) + 52 at allocation.inline.hpp:93
>  frame #7: 0x000000010acf0a9f libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8601c60) + 63 at perfData.cpp:116
>  frame #8: 0x000000010acf0ae5 libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8601c60) + 21 at perfData.cpp:114
>  frame #9: 0x000000010acf163d libjvm.dylib`PerfDataManager::destroy() + 109 at perfData.cpp:287
>  frame #10: 0x000000010acf3f4d libjvm.dylib`perfMemory_exit() + 61 at perfMemory.cpp:74
>  frame #11: 0x000000010ac9bb0d libjvm.dylib`os::shutdown() + 13 at os_bsd.cpp:1130
>  frame #12: 0x000000010ac9bb55 libjvm.dylib`os::abort(dump_core=false) + 21 at os_bsd.cpp:1150
>  frame #13: 0x000000010a9188e7 libjvm.dylib`vm_abort(dump_core=false) + 39 at java.cpp:666
>  frame #14: 0x000000010aa4f1e7 libjvm.dylib`JVMCIRuntime::abort_on_pending_exception(exception=Handle @ 0x0000700001452208, message="Uncaught exception at /Users/dsimon/graal/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127", dump_core=false) + 167 at jvmciRuntime.cpp:992
>  frame #15: 0x000000010aa17017 libjvm.dylib`JVMCICompiler::compile_method(this=0x00007fedcb203050, method=0x00007000014528d8, entry_bci=-1, env=0x00007000014528f0) + 311 at jvmciCompiler.cpp:127
>  frame #16: 0x000000010a656cd2 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x00007fedc862a320) + 1314 at compileBroker.cpp:2207
> 
> 
> This webrev replaces calls to vm_abort() with before_exit() + vm_exit(). The latter is thread safe.
> 
> https://bugs.openjdk.java.net/browse/JDK-8165755
> http://cr.openjdk.java.net/~dnsimon/8165755/
> 
> -Doug


From doug.simon at oracle.com  Fri Sep  9 18:33:40 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 9 Sep 2016 20:33:40 +0200
Subject: RFR: 8165755: [JVMCI] replace use of vm_abort with vm_exit
In-Reply-To: <6391B00B-AFBF-410C-A6A1-2ED95B35EBEB@twitter.com>
References: <799CA13D-6BDC-4BF5-9241-515A684191F4@oracle.com>
	<6391B00B-AFBF-410C-A6A1-2ED95B35EBEB@twitter.com>
Message-ID: <2F360221-7E6B-43BB-B69C-69A17777E5F2@oracle.com>

Can someone from the runtime team confirm that using vm_exit (instead of vm_abort) is the best way to stop the VM when JVMCI initialization fails (e.g., when invalid JVMCI options are provided on the command line)? Thanks!

-Doug


> On 09 Sep 2016, at 19:48, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
> I think this looks fine but maybe we should ask the runtime folks.
> 
>> On Sep 8, 2016, at 11:01 PM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> Calling vm_abort from multiple threads can cause nasty crashes such as double free errors. We've seen this in Graal during JVMCI initialization when an unknown Graal option is encountered. Multiple compiler threads try to initialize JVMCI which fails with an exception indicating the bad option:
>> 
>> Uncaught exception at /scratch/graaluser/buildslave/buildlog/ci_executor/main/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127 
>> java.lang.ExceptionInInitializerError 
>>       at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime(HotSpotJVMCIRuntime.java:85) 
>>       at jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method) 
>>       at jdk.vm.ci.runtime.JVMCI.<clinit>(JVMCI.java:58) 
>> Caused by: java.lang.IllegalArgumentException: Could not find option OptSomethingThatDoesNotExcist 
>>       at com.oracle.graal.options.OptionsParser.parseOption(OptionsParser.java:134) 
>>       at com.oracle.graal.options.OptionsParser.parseOptions(OptionsParser.java:62) 
>>       at com.oracle.graal.hotspot.HotSpotGraalCompilerFactory.initializeOptions(HotSpotGraalCompilerFactory.java:156) 
>>       at com.oracle.graal.hotspot.HotSpotGraalCompilerFactory.onSelection(HotSpotGraalCompilerFactory.java:86) 
>>       at jdk.vm.ci.hotspot.HotSpotJVMCICompilerConfig.getCompilerFactory(HotSpotJVMCICompilerConfig.java:96) 
>>       at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.<init>(HotSpotJVMCIRuntime.java:277) 
>>       at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.<init>(HotSpotJVMCIRuntime.java:67) 
>>       at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime$DelayedInit.<clinit>(HotSpotJVMCIRuntime.java:75) 
>>       at jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.runtime(HotSpotJVMCIRuntime.java:85) 
>>       at jdk.vm.ci.runtime.JVMCI.initializeRuntime(Native Method) 
>>       at jdk.vm.ci.runtime.JVMCI.<clinit>(JVMCI.java:58) 
>> 
>> The native JVMCI code then tries to exit the VM by calling vm_abort. If multiple compiler threads do this concurrently, certain destructors can be called twice as shown by these thread dumps:
>> 
>> thread #26: tid = 0x0019, 0x00007fff84280124 libsystem_malloc.dylib`szone_size + 227, stop reason = signal SIGSTOP
>> frame #0: 0x00007fff84280124 libsystem_malloc.dylib`szone_size + 227
>> frame #1: 0x00007fff8427fed5 libsystem_malloc.dylib`free + 61
>> frame #2: 0x000000010ac95963 libjvm.dylib`os::free(memblock=0x00007fedc86226e0, memflags=mtInternal) + 307 at os.cpp:711
>> frame #3: 0x000000010a2afc54 libjvm.dylib`FreeHeap(p=0x00007fedc86226e0, memflags=mtInternal) + 52 at allocation.inline.hpp:93
>> frame #4: 0x000000010acf0a9f libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8622650) + 63 at perfData.cpp:116
>> frame #5: 0x000000010acf0ae5 libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8622650) + 21 at perfData.cpp:114
>> frame #6: 0x000000010acf163d libjvm.dylib`PerfDataManager::destroy() + 109 at perfData.cpp:287
>> frame #7: 0x000000010acf3f4d libjvm.dylib`perfMemory_exit() + 61 at perfMemory.cpp:74
>> frame #8: 0x000000010ac9bb0d libjvm.dylib`os::shutdown() + 13 at os_bsd.cpp:1130
>> frame #9: 0x000000010ac9bb55 libjvm.dylib`os::abort(dump_core=false) + 21 at os_bsd.cpp:1150
>> frame #10: 0x000000010a9188e7 libjvm.dylib`vm_abort(dump_core=false) + 39 at java.cpp:666
>> frame #11: 0x000000010aa4f1e7 libjvm.dylib`JVMCIRuntime::abort_on_pending_exception(exception=Handle @ 0x000070000175b208, message="Uncaught exception at /Users/dsimon/graal/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127", dump_core=false) + 167 at jvmciRuntime.cpp:992
>> frame #12: 0x000000010aa17017 libjvm.dylib`JVMCICompiler::compile_method(this=0x00007fedcb203050, method=0x000070000175b8d8, entry_bci=-1, env=0x000070000175b8f0) + 311 at jvmciCompiler.cpp:127
>> frame #13: 0x000000010a656cd2 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x00007fedc853ca30) + 1314 at compileBroker.cpp:2207
>> 
>> thread #23: tid = 0x0016, 0x00007fff91fcb122 libsystem_kernel.dylib`__semwait_signal_nocancel + 10, stop reason = signal SIGSTOP
>> frame #0: 0x00007fff91fcb122 libsystem_kernel.dylib`__semwait_signal_nocancel + 10
>> frame #1: 0x00007fff9578c318 libsystem_c.dylib`nanosleep$NOCANCEL + 188
>> frame #2: 0x00007fff957b62ce libsystem_c.dylib`usleep$NOCANCEL + 54
>> frame #3: 0x00007fff957e46e9 libsystem_c.dylib`abort + 139
>> frame #4: 0x00007fff8428c396 libsystem_malloc.dylib`szone_error + 626
>> frame #5: 0x000000010ac95963 libjvm.dylib`os::free(memblock=0x00007fedc8601cd0, memflags=mtInternal) + 307 at os.cpp:711
>> frame #6: 0x000000010a2afc54 libjvm.dylib`FreeHeap(p=0x00007fedc8601cd0, memflags=mtInternal) + 52 at allocation.inline.hpp:93
>> frame #7: 0x000000010acf0a9f libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8601c60) + 63 at perfData.cpp:116
>> frame #8: 0x000000010acf0ae5 libjvm.dylib`PerfData::~PerfData(this=0x00007fedc8601c60) + 21 at perfData.cpp:114
>> frame #9: 0x000000010acf163d libjvm.dylib`PerfDataManager::destroy() + 109 at perfData.cpp:287
>> frame #10: 0x000000010acf3f4d libjvm.dylib`perfMemory_exit() + 61 at perfMemory.cpp:74
>> frame #11: 0x000000010ac9bb0d libjvm.dylib`os::shutdown() + 13 at os_bsd.cpp:1130
>> frame #12: 0x000000010ac9bb55 libjvm.dylib`os::abort(dump_core=false) + 21 at os_bsd.cpp:1150
>> frame #13: 0x000000010a9188e7 libjvm.dylib`vm_abort(dump_core=false) + 39 at java.cpp:666
>> frame #14: 0x000000010aa4f1e7 libjvm.dylib`JVMCIRuntime::abort_on_pending_exception(exception=Handle @ 0x0000700001452208, message="Uncaught exception at /Users/dsimon/graal/graal-jvmci-8/src/share/vm/jvmci/jvmciCompiler.cpp:127", dump_core=false) + 167 at jvmciRuntime.cpp:992
>> frame #15: 0x000000010aa17017 libjvm.dylib`JVMCICompiler::compile_method(this=0x00007fedcb203050, method=0x00007000014528d8, entry_bci=-1, env=0x00007000014528f0) + 311 at jvmciCompiler.cpp:127
>> frame #16: 0x000000010a656cd2 libjvm.dylib`CompileBroker::invoke_compiler_on_method(task=0x00007fedc862a320) + 1314 at compileBroker.cpp:2207
>> 
>> 
>> This webrev replaces calls to vm_abort() with before_exit() + vm_exit(). The latter is thread safe.
>> 
>> https://bugs.openjdk.java.net/browse/JDK-8165755
>> http://cr.openjdk.java.net/~dnsimon/8165755/
>> 
>> -Doug
> 


From jamsheed.c.m at oracle.com  Fri Sep  9 22:23:00 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Sat, 10 Sep 2016 03:53:00 +0530
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
Message-ID: <05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>

Adding a little more description, as per my understanding.

This issue can happen only when compiled lforms are not inlined.

There are two scenarios:
1) no compiled lforms are inlined
2) some compiled lforms are inlined, or the final method is not inlined
(linkTo* not inlined), i.e. partially inlined

In all these cases the *Invoke instruction* will be the *return Value*, and it
will have an erased type.
So we reify the return type either by type casting (for the partially inlined
case) or by pulling it directly from the callsite MT.
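
To illustrate what "erased type" and "pulling from the callsite MT" mean at the Java API level, here is a small sketch using java.lang.invoke.MethodType. It only models the idea; it is not the compiler change in the webrev, and the class ErasedReturnType and its method names are hypothetical.

```java
import java.lang.invoke.MethodType;

// Hypothetical illustration: the erased view of a signature loses the
// declared return type, while the call site's MethodType still carries it.
final class ErasedReturnType {
    // Return type as declared at the call site.
    static Class<?> declaredReturn(MethodType callSiteType) {
        return callSiteType.returnType();
    }

    // Return type after erasing all reference types to Object.
    static Class<?> erasedReturn(MethodType callSiteType) {
        return callSiteType.erase().returnType();
    }

    // "Reify" an erased result by casting it back to the declared return type.
    static Object reify(MethodType callSiteType, Object erasedResult) {
        return callSiteType.returnType().cast(erasedResult);
    }

    public static void main(String[] args) {
        MethodType mt = MethodType.methodType(String.class, Integer.class);
        System.out.println(declaredReturn(mt) + " vs " + erasedReturn(mt));
    }
}
```

The cast in reify() plays the role of the type cast at the lform return; reading returnType() from the original MethodType corresponds to taking the type from the call site.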

Best Regards,

Jamsheed


On 9/8/2016 3:26 PM, Jamsheed C m wrote:
> Hi All,
>
> bugid: https://bugs.openjdk.java.net/browse/JDK-8134389
>
> webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/
>
> return type information is not available in lforms; this causes 
> contradictions in operations like store indexed, mh _linkTo* site arg 
> type casting, etc.
>
> fix: TypeCast to declared return type at lform return.
>
> Best Regards,
>
> Jamsheed
>

From jamsheed.c.m at oracle.com  Sun Sep 11 11:51:43 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Sun, 11 Sep 2016 17:21:43 +0530
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
	<05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
Message-ID: <7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>

I made some changes to my fix. The webrev is updated in place.

PIT results with the latest modification are updated in the bug (not yet complete).

Best Regards,

Jamsheed


On 9/10/2016 3:53 AM, Jamsheed C m wrote:
>
> Adding a little more description, as per my understanding.
>
> This issue can happen only when compiled lforms are not inlined.
>
> There are two scenarios:
> 1) no compiled lforms are inlined
> 2) some compiled lforms are inlined, or the final method is not inlined
> (linkTo* not inlined), i.e. partially inlined
>
> In all these cases the *Invoke instruction* will be the *return Value*, and it
> will have an erased type.
> So we reify the return type either by type casting (for the partially inlined
> case) or by pulling it directly from the callsite MT.
>
> Best Regards,
>
> Jamsheed
>
>
> On 9/8/2016 3:26 PM, Jamsheed C m wrote:
>> Hi All,
>>
>> bugid: https://bugs.openjdk.java.net/browse/JDK-8134389
>>
>> webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/
>>
>> return type information is not available in lforms; this causes 
>> contradictions in operations like store indexed, mh _linkTo* site arg 
>> type casting, etc.
>>
>> fix: TypeCast to declared return type at lform return.
>>
>> Best Regards,
>>
>> Jamsheed
>>
>

From volker.simonis at gmail.com  Mon Sep 12 16:35:24 2016
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 12 Sep 2016 18:35:24 +0200
Subject: RFR(S): 8159611: C2: ArrayCopy elimination skips required
	parameter checks
In-Reply-To: <7ce01d28-13f5-098a-9898-080f8258881d@oracle.com>
References: <CA+3eh10fWqYQbL-eKEk=ud26pR5C1hM6Z_S0=ROJe-TnipKcNQ@mail.gmail.com>
	<57B2A380.6000408@oracle.com>
	<CA+3eh10yxdunPPYRptfTghA6LKRc5iO+VU=EsFr2PJgQmexQag@mail.gmail.com>
	<41851a79-5ffe-2b9d-504a-6a2301de5384@oracle.com>
	<CA+3eh12Jmcnt83YZ3cWaX1+JdSvY2pTNaLC=uQSLmK-6oGMZHA@mail.gmail.com>
	<7ce01d28-13f5-098a-9898-080f8258881d@oracle.com>
Message-ID: <CA+3eh11qwNSwAdtxq1QyqRFcHpLsLSnKFSRH_5Vx8Zi8u5Ci_A@mail.gmail.com>

Sorry for the long delay...

Here's my new version:

http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v3/

I've actually changed PhaseMacroExpand::expand_arraycopy_node() such
that it calls generate_arraycopy() with 'length_never_negative' set to
true if EliminateAllocations is true (in this case we already checked
in LibraryCallKit::inline_arraycopy() that 'length' is not negative).
This way I could leave generate_arraycopy() untouched.

The generated code now looks as follows:

Original version (without 'length < 0' check):

0a7   B5: #    B17 B6 <- B4  Freq: 0,999998
0a7       cmpl    R9, R11    # unsigned
0aa       jb,u  B17  P=0,000001 C=-1,000000
...
0da   B7: #    B18 B8 <- B6 B12 B13  Freq: 0,999997
0da       movl    R11, [rsp + #8]    # spill
0df       testl   R11, R11
0e2       jle     B18  P=0,000001 C=-1,000000
...
0e8   B8: #    B9 <- B7  Freq: 0,999996
0f9       call_leaf_nofp,runtime  oop_disjoint_arraycopy
...
106   B9: #    B10 <- B8 B18 B20  Freq: 0,999997
113       ret
...
184   B17: #    N1 <- B4 B5  Freq: 2,01328e-06
193       call,static  wrapper for:
uncommon_trap(reason='intrinsic_or_type_checked_inlining'
action='make_not_entrant' debug_id='0')

19d   B18: #    B9 B19 <- B7  Freq: 9,99997e-07
19d       testl   R11, R11
1a0       jge     B9  P=0,999999 C=-1,000000
1a0
1a6   B19: #    B22 B20 <- B18  Freq: 9,99997e-13
1a6       movq    RSI, R8    # spill
1a9       movl    RDX, #1    # int
1ae       movq    RCX, R10    # spill
1b1       movl    R8, #1    # int
1b7       movl    R9, R11    # spill
          nop     # 1 bytes pad for loops and calls
1bb       call,static  wrapper for: slow_arraycopy

In B5 there's a check if 'offset+length' is still in the array range.
If not we jump to the uncommon trap in B17.
In B7 there's the first check from
PhaseMacroExpand::generate_arraycopy() (i.e.
generate_nonpositive_guard()). If 'length' is less than or equal to
zero we jump to B18 where there's the second check from
PhaseMacroExpand::generate_arraycopy() (i.e.
generate_negative_guard()). If 'length' is zero, we jump to B9 and
return. Otherwise we fall into B19 from where we call slow_arraycopy.
slow_arraycopy (which is generated in ObjArrayKlass::copy_array()) will
throw an ArrayIndexOutOfBoundsException if 'length' is negative.
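
For reference, the Java-level behavior that these guards have to preserve can be checked directly against System.arraycopy. The sketch below only exercises the documented library semantics (a negative or out-of-range length throws IndexOutOfBoundsException, a zero length copies nothing); the class ArrayCopyLengthCases and its helper are hypothetical names.

```java
// Hypothetical helper that reports how System.arraycopy reacts to a given length.
final class ArrayCopyLengthCases {
    // Returns "ok" if the copy completed, or the name of the exception category.
    static String outcome(Object[] src, int srcPos, Object[] dst, int dstPos, int length) {
        try {
            System.arraycopy(src, srcPos, dst, dstPos, length);
            return "ok";
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass and is caught here too.
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        Object[] a = new Object[4];
        Object[] b = new Object[4];
        System.out.println(outcome(a, 0, b, 0, -1)); // negative length
        System.out.println(outcome(a, 0, b, 0, 0));  // zero length
        System.out.println(outcome(a, 0, b, 0, 5));  // past the end of the array
    }
}
```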

The new version now looks as follows:

0a2   B5: #    B19 B6 <- B4  Freq: 0,999998
0a2       cmpl    R10, RCX    # unsigned
0a5       jb,u  B19  P=0,000001 C=-1,000000
0a5
0ab   B6: #    B20 B7 <- B5  Freq: 0,999997
0ab       movl    R10, [rsp + #0]    # spill
0af       testl   R10, R10
0b2       jl     B20  P=0,000001 C=-1,000000
0b2
...
0e2   B8: #    B10 B9 <- B7 B13 B14  Freq: 0,999996
0e2       testl   R10, R10
0e5       je,s   B10  P=0,000001 C=-1,000000
...
0e7   B9: #    B10 <- B8  Freq: 0,999995
0f8       call_leaf_nofp,runtime  oop_disjoint_arraycopy
...
105   B10: #    B11 <- B9 B8  Freq: 0,999996
112       ret
...
18e   B19: #    B20 <- B5  Freq: 9,99998e-07
192   B20: #    N1 <- B18 B19 B6  Freq: 3,01327e-06
1a3       call,static  wrapper for:
uncommon_trap(reason='intrinsic_or_type_checked_inlining'
action='make_not_entrant' debug_id='0')

B5 is like before, but is now followed by the extra check for 'length'
being not negative in B6. In B8 we now have the first check (i.e.
generate_nonpositive_guard()) from
PhaseMacroExpand::generate_arraycopy(). It directly checks if 'length'
is zero and jumps to B10 (i.e. returns) if so. Otherwise we fall
directly into oop_disjoint_arraycopy(). There's no need to check for
'length' being negative and to call 'slow_arraycopy' because this case
is already handled earlier (in B6).
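
The two control flows described above can be summarized as a small decision model. This is a simplified sketch, not C2 code: the 'inRange' flag stands in for the 'offset+length' check in B5, and the class GuardOrderModel, its enum, and its method names are hypothetical. The block labels in the comments refer to the two listings in this mail.

```java
// Hypothetical model of which path each listing takes for a given 'length'.
final class GuardOrderModel {
    enum Path { UNCOMMON_TRAP, SLOW_ARRAYCOPY, RETURN_NO_COPY, FAST_STUB }

    // Original version: no explicit 'length < 0' check before the guards.
    static Path original(boolean inRange, int length) {
        if (!inRange) return Path.UNCOMMON_TRAP;          // B5 -> B17
        if (length <= 0) {                                // B7 -> B18
            if (length >= 0) return Path.RETURN_NO_COPY;  // B18 -> B9
            return Path.SLOW_ARRAYCOPY;                   // B19
        }
        return Path.FAST_STUB;                            // B8
    }

    // New version: negative lengths are rejected up front, so only a
    // zero check remains before the fast stub.
    static Path updated(boolean inRange, int length) {
        if (!inRange) return Path.UNCOMMON_TRAP;          // B5 -> B19 -> B20
        if (length < 0) return Path.UNCOMMON_TRAP;        // B6 -> B20
        if (length == 0) return Path.RETURN_NO_COPY;      // B8 -> B10
        return Path.FAST_STUB;                            // B9
    }

    public static void main(String[] args) {
        for (int len : new int[] { -1, 0, 3 }) {
            System.out.println(len + ": " + original(true, len) + " / " + updated(true, len));
        }
    }
}
```

The only input for which the two models differ is a negative length that passes the range check: the original flow ends in slow_arraycopy, the new one in the uncommon trap.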

Is this OK now?

Thank you and best regards,
Volker


On Fri, Aug 26, 2016 at 3:51 AM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:
> Looks good.
>
> Check does not fold because it is different: LT vs LE.
>
> Actually there are 3 checks together with yours (see
> PhaseMacroExpand::generate_arraycopy()):
>
>   Node* not_pos = generate_nonpositive_guard(ctrl, copy_length,
> length_never_negative);
>   if (not_pos != NULL) {
>     Node* local_ctrl = not_pos, *local_io = *io;
>     MergeMemNode* local_mem = MergeMemNode::make(mem);
>     transform_later(local_mem);
>
>     // (6) length must not be negative.
>     if (!length_never_negative) {
>       generate_negative_guard(&local_ctrl, copy_length, slow_region);
>     }
>
> I think the only way to avoid this is to modify code in generate_arraycopy()
> when EliminateAllocations is true. In such case you need to generate only
> length == 0 check.
>
> Thanks,
> Vladimir
>
>
> On 8/25/16 10:03 AM, Volker Simonis wrote:
>>
>> On Tue, Aug 16, 2016 at 11:49 PM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com> wrote:
>>>
>>> Not generating an exception is definitely a bug.
>>>
>>> First, about test case. It would be nice if it also verifies other
>>> IndexOutOfBoundsException cases.
>>>
>>
>> I've extended the test case. See:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v2/
>>
>> With the new test I've caught another problem in C1 (only on x86 and
>> s390, but that's not in the OpenJDK yet :).
>>
>> LIR_Assembler::emit_arraycopy() had a shortcut for length==0 which
>> prevented the throwing of an ArrayStoreException if src and dst arrays
>> have incompatible type (see do_test2() in the new regression test).
>> Note that this is a different error from 8160591 and not fixed by the
>> change for 8160591.
>>
>> I've also moved the new check after the offset + length check as
>> suggested by you (see new webrev).
>>
>> Unfortunately, the new check is still not eliminated. Here's how it looks:
>>
>> 0ae   B6: #    B20 B7 <- B5  Freq: 0,999997
>> 0ae       movl    R9, [rsp + #0]    # spill
>> 0b2       testl   R9, R9
>> 0b5       jl     B20  P=0,000001 C=-1,000000
>> 0b5
>> 0bb   B7: #    B12 B8 <- B6  Freq: 0,999996
>> 0bb       movl    R11, [R10 + #8 (8-bit)]    # compressed klass ptr
>> 0bf       decode_klass_not_null RAX,R11
>> 0cc       movl    RBX, [RAX + #16 (8-bit)]    # int
>> 0cf       movslq  RCX, RBX    # i2l
>> 0d2       movq    RSI, precise klass [Ljava/lang/Object;:
>> 0x00007ff1080320d0:Constant:exact *    # ptr
>> 0dc       movq    RCX, [RSI + RCX]    # class
>> 0e0       cmpq    RAX, RCX    # ptr
>> 0e3       jne,us  B12  P=0,170000 C=-1,000000
>> 0e3
>> 0e5   B8: #    B21 B9 <- B7 B13 B14  Freq: 0,999996
>> 0e5       testl   R9, R9
>> 0e8       jle     B21  P=0,000001 C=-1,000000
>>
>> As you can see 'testl   R9, R9' is executed two times.
>>
>> I've even tried to move the new check after the subtype check, but
>> that doesn't help either:
>>
>> 0da   B7: #    B20 B8 <- B6 B13 B14  Freq: 0,999997
>> 0da       movl    R11, [rsp + #8]    # spill
>> 0df       testl   R11, R11
>> 0e2       jl     B20  P=0,000001 C=-1,000000
>> 0e2
>> 0e8   B8: #    B10 B9 <- B7  Freq: 0,999996
>> 0e8       testl   R11, R11
>> 0eb       jle,s   B10  P=0,000001 C=-1,000000
>>
>> Any idea how this could be fixed?
>>
>> Thanks,
>> Volker
>>
>> PS: and I still don't have a reproducible benchmark which shows a
>> regression with my change...
>>
>>
>>> Actually, the additional dynamic check will help when a negative length
>>> is known during compilation. The allocation code will be eliminated very
>>> early instead of waiting for macro expansion:
>>>
>>>       int length = alloc->in(AllocateNode::ALength)->find_int_con(-1);
>>>       if (length < 0) {
>>>         NOT_PRODUCT(fail_eliminate = "Array's size is not constant";)
>>>         can_eliminate = false;
>>>       }
>>>
>>> About additional length check in your new test. I think it may be
>>> collapsed
>>> with preceding check since it is generated after other checks.
>>> So I would suggest to move it after offset + length check.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>> On 8/16/16 7:57 AM, Volker Simonis wrote:
>>>>
>>>>
>>>> On Tue, Aug 16, 2016 at 7:24 AM, Tobias Hartmann
>>>> <tobias.hartmann at oracle.com> wrote:
>>>>>
>>>>>
>>>>> Hi Volker,
>>>>>
>>>>> thanks for taking care of this issue!
>>>>>
>>>>> Did you check what happens if the allocation is not eliminated and
>>>>> macro
>>>>> expansion phase emits another negative guard? Are the checks merged?
>>>>>
>>>>
>>>> It depends. I just saw that in some cases the regression test worked
>>>> before, because the length check was done in
>>>> SharedRuntime::slow_arraycopy_C(). So in that case there's obviously
>>>> nothing that can be merged. But the test case is obviously a
>>>> degenerated example anyway, so I don't think that's a problem.
>>>>
>>>> If I do a more real-world example like this where the arraycopy cannot
>>>> be eliminated because one of its arguments escapes:
>>>>
>>>>   public static boolean do_test2(int length, Object[] dest) {
>>>>     try {
>>>>       System.arraycopy(new Object[10], 1, dest, 1, length);
>>>>       return false;
>>>>     } catch (IndexOutOfBoundsException e) {
>>>>       return true;
>>>>     }
>>>>   }
>>>>
>>>> and call it with:
>>>>
>>>> do_test2(8, new Object[10])
>>>>
>>>> the generated code for do_test2() unfortunately contains one more
>>>> check now with my change (the 'length' field is in [rsp + #0]):
>>>>
>>>> 0a2   B4: #    B18 B5 <- B3  Freq: 0,999999
>>>> 0a2       movl    R9, [rsp + #0]    # spill
>>>> 0a6       testl   R9, R9
>>>> 0a9       jl     B18  P=0,000001 C=-1,000000
>>>> 0a9
>>>> 0af   B5: #    B18 B6 <- B4  Freq: 0,999998
>>>> 0af       movl    RBX, R9    # spill
>>>> 0b2       incl    RBX    # int
>>>> 0b4       cmpl    RBX, #10    # unsigned
>>>> 0b7       jnbe,u  B18  P=0,000001 C=-1,000000
>>>>
>>>> The generated code before my change looked like this (again, the
>>>> 'length' field is in [rsp + #0]):
>>>>
>>>> 0a1   B4: #    B17 B5 <- B3  Freq: 0,999999
>>>> 0a1       movl    R11, [rsp + #8]    # spill
>>>> 0a6       incl    R11    # int
>>>> 0a9       cmpl    R11, #10    # unsigned
>>>> 0ad       jnbe,u  B17  P=0,000001 C=-1,000000
>>>>
>>>> It seems that the 'length' check has been completely eliminated before.
>>>>
>>>> So I need to do some more tests to understand why the new check isn't
>>>> eliminated.
>>>>
>>>> Do you think the new check results in a performance regression? Have
>>>> you run some benchmarks?
>>>>
>>>>> I would prefer brackets around the if body but you don't need to send
>>>>> another webrev:
>>>>>  if (EliminateAllocations) {
>>>>>    generate_negative_guard(length, slow_region);
>>>>>  }
>>>>
>>>>
>>>>
>>>> Yes, I agree.
>>>>
>>>>>
>>>>> Best regards,
>>>>> Tobias
>>>>>
>>>>> On 12.08.2016 21:13, Volker Simonis wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> can I please have a review and sponsor for the following fix:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8159611
>>>>>>
>>>>>>
>>>>>> We are inserting several checks for the arguments of
>>>>>> System.arraycopy() in LibraryCallKit::inline_arraycopy() before
>>>>>> intrinsifying the call in LibraryCallKit::inline_arraycopy(). However, the
>>>>>> check for the 'length' argument of arraycopy is postponed to the macro
>>>>>> expansion phase in PhaseMacroExpand::generate_arraycopy().
>>>>>>
>>>>>> But if we are running with EscapeAnalysis and EliminateAllocations,
>>>>>> the array allocations inside a call to System.arraycopy() may get
>>>>>> eliminated and thus the complete call to System.arraycopy() will be
>>>>>> removed (see PhaseMacroExpand::process_users_of_allocation). In this
>>>>>> case the extra 'length' check won't be added by
>>>>>> PhaseMacroExpand::generate_arraycopy() any more because macro
>>>>>> expansion happens after the elimination of macro nodes.
>>>>>>
>>>>>> In such a case it may happen that System.arraycopy() will silently
>>>>>> accept an invalid (i.e. negative) 'length' parameter, although it
>>>>>> should actually throw an IndexOutOfBoundsException.
>>>>>>
>>>>>> The fix is simple: also insert a check for the length field in
>>>>>> LibraryCallKit::inline_arraycopy() if we are running with
>>>>>> EliminateAllocations.
>>>>>>
>>>>>> Regards,
>>>>>> Volker
>>>>>>
>>>
>
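
For reference, the exception behaviour that the quoted thread expects from
System.arraycopy() can be exercised with a small stand-alone program. This is
only a sketch under stated assumptions: the class name ArraycopyChecks and the
helper attempt() are hypothetical, and the int[]/long[] case is one example of
an "incompatible type" copy with length 0, not necessarily the one used in
do_test2() of the regression test.

```java
class ArraycopyChecks {
    /**
     * Runs System.arraycopy and returns the exception it threw,
     * or null if the copy completed normally.
     */
    static RuntimeException attempt(Object src, int srcPos,
                                    Object dest, int destPos, int length) {
        try {
            System.arraycopy(src, srcPos, dest, destPos, length);
            return null;
        } catch (IndexOutOfBoundsException | ArrayStoreException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        // Negative length: must be rejected, even if a JIT removed the allocation.
        System.out.println(attempt(new Object[10], 1, new Object[10], 1, -1));
        // srcPos + length runs past the end of the source array.
        System.out.println(attempt(new Object[10], 1, new Object[10], 1, 10));
        // length == 0, but the component types are different primitive types.
        System.out.println(attempt(new int[1], 0, new long[1], 0, 0));
        // A valid copy, as in do_test2(8, new Object[10]) from the thread.
        System.out.println(attempt(new Object[10], 1, new Object[10], 1, 8));
    }
}
```

Calling the helper in a hot loop, so that the JIT compilers get a chance to
compile it, is what would expose a missing check; a single interpreted call
only shows the expected behaviour.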

From cthalinger at twitter.com  Mon Sep 12 17:26:02 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Mon, 12 Sep 2016 07:26:02 -1000
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <92B9E4F8-DF56-475B-A9EC-6FB179C58925@twitter.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
	<A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>
	<7ED300F2-253B-4550-BF5E-878A99EDAEB2@oracle.com>
	<92B9E4F8-DF56-475B-A9EC-6FB179C58925@twitter.com>
Message-ID: <938FBEA5-0AAB-4640-B231-E259B00275AB@twitter.com>


> On Sep 6, 2016, at 11:58 AM, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
>> 
>> On Sep 6, 2016, at 11:37 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> 
>>> On 06 Sep 2016, at 20:14, Christian Thalinger <cthalinger at twitter.com> wrote:
>>> 
>>> 
>>>> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> In jvmci-8, we increased the interpreter code size when JVMCI code is included: 
>>>> 
>>>> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 
>>> 
>>> What about SPARC?  Have we ever seen a problem there?  Or AArch64 for that matter?
>> 
>> I've only ever seen problems on AMD64. I've never seen it on SPARC and have never run on AArch64.
>> 
>> The real fix is that the interpreter generator should never have to guess the size of the code buffer it needs but should resize things as needed after generating the interpreter.
> 
> Yes, it should.

Forgot to say that it looks good.


From vitalyd at gmail.com  Mon Sep 12 19:13:27 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Mon, 12 Sep 2016 15:13:27 -0400
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
Message-ID: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>

Hi all,

Vladimir I. and I have been looking at a peculiarity in EA as it relates to
eliminating the ArrayList$Itr.  What Vladimir found (and I see it as well)
is that ArrayList$Itr::init isn't always inlined due to "unloaded signature
classes", e.g.:

@ 6   java.util.ArrayList::iterator (10 bytes)   inline (hot)
                              @ 6   java.util.ArrayList$Itr::<init> (6
bytes)   unloaded signature classes

I tried to dig a bit further into this, and it appears that what's
"unloaded" is ArrayList$1.  LogCompilation shows this (which I think is
relevant):
<bc code='183' bci='6'/>
<type id='709' name='void'/>
<klass id='827' name='java/util/ArrayList$1' unloaded='1'/>
<klass id='821' name='java/util/ArrayList$Itr' flags='2'/>
<method id='828' holder='821' name='&lt;init&gt;' return='709'
arguments='820 827' flags='4096' bytes='6' iicount='1853'/>
<call method='828' count='-1' prof_factor='0.602806' inline='1'/>
<inline_fail reason='unloaded signature classes'/>
<direct_call bci='6'/>
<parse_done nodes='100' live='98' memory='35824' stamp='1.114'/>
</parse>

It looks like ArrayList$1 is a synthetic class generated by javac because
ArrayList$Itr constructor is private (despite the class itself being
private).  Here's the bytecode (8u51) of ArrayList::iterator:

public java.util.Iterator<E> iterator();
    descriptor: ()Ljava/util/Iterator;
    flags: ACC_PUBLIC
    Code:
      stack=4, locals=1, args_size=1
         0: new           #61                 // class
java/util/ArrayList$Itr
         3: dup
         4: aload_0
         5: aconst_null
         6: invokespecial #62                 // Method
java/util/ArrayList$Itr."<init>":(Ljava/util/ArrayList;Ljava/util/ArrayList$1;)V
         9: areturn
      LineNumberTable:
        line 834: 0
    Signature: #185                         // ()Ljava/util/Iterator<TE;>;

The only way I can get the Itr allocation removed in my method is by
causing some other method that does the same thing to be JIT compiled prior
to mine.

Does anyone have a good idea of what's actually going on here? Why is that
synthetic ArrayList$1 such a pest here? It's a bit sad that such a little
thing can prevent EA from working in a perfectly good candidate method for
it.

Thoughts?

Thanks

From rednaxelafx at gmail.com  Mon Sep 12 19:19:24 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Mon, 12 Sep 2016 12:19:24 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
Message-ID: <CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>

Hi Vitaly,

Haha. I've actually fixed the exact same problem in Zing JVM when I found
this out a while ago. Do you guys want the patch to be upstreamed?

Here's the bug description that I wrote for Zing, but it applies to HotSpot
as well (since we inherited that bug from HotSpot):

This bug is to track an enhancement that would allow compilation and
inlining of "bridge constructors" for private inner classes, generated by
javac.

In HotSpot's compilation policy, and C2's inlining heuristic, if a
method/constructor is found to have unloaded classes in its signature, then
there is special handling:
 * in compilation policy, if a C2 compilation is about to be triggered for a
method, and there are unloaded classes in its signature, then these
classes are forced to be loaded before compilation;
 * in C2, when a method is considered to be a candidate for inlining, if
there are unloaded classes in its signature, it will NOT be inlined.

It's questionable whether or not the C2 inlining heuristic is profitable in
general, but there's a case where it's definitely not profitable - when
dealing with "bridge constructors" generated by javac.

When javac sees a private inner class with no explicit constructors, e.g.

> package java.util;
>
> public class ArrayList<E> implements Iterable<E> {
>   public Iterator<E> iterator() {
>     return new Itr();
>   }
>
>   private class Itr implements Iterator<E> { }
> }

javac will synthesize two constructors for the inner class (e.g. Itr above):
1. The normal default constructor, with accessibility the same as its
holder class - private
  private java.util.ArrayList$Itr(java.util.ArrayList);
2. A "bridge constructor". The enclosing class needs to access
Itr's constructor but doesn't have accessibility to the private one, so
javac further synthesizes this "bridge constructor" with package
accessibility, which simply delegates to the private default one:
  java.util.ArrayList$Itr(java.util.ArrayList, java.util.ArrayList$1);

The sole purpose of the "bridge constructor" is to provide accessibility,
but if it were only different from the private one in its accessibility,
the two constructors wouldn't be distinguishable under the JVM's overload
resolution rules. So, javac pulls a trick, and appends a marker argument
called "access constructor tag" to the argument list of the bridge
constructor, e.g. java.util.ArrayList$1 in this example, and always passes
a null to this argument.
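
The two constructors described above can be inspected with reflection. The
following is a minimal sketch, not part of the original bug description: the
class BridgeCtorDemo and its helper are hypothetical names. Note that the
result depends on the javac version. A javac of the 8u era is expected to emit
both the private constructor and the synthetic one with the extra
BridgeCtorDemo$1 parameter, while newer compilers that use nest-based access
control (JDK 11 and later) emit only the private constructor.

```java
import java.lang.reflect.Constructor;

class BridgeCtorDemo {
    // A private inner class with no explicit constructor, like ArrayList$Itr.
    private class Itr { }

    // Instantiating Itr from the enclosing class is what makes older javac
    // versions synthesize the package-private "bridge constructor".
    Object newItr() {
        return new Itr();
    }

    /** Returns every constructor that javac emitted for the inner class. */
    static Constructor<?>[] itrConstructors() {
        return Itr.class.getDeclaredConstructors();
    }

    public static void main(String[] args) {
        for (Constructor<?> c : itrConstructors()) {
            // Prints modifiers and parameter types; synthetic ones are flagged.
            System.out.println((c.isSynthetic() ? "[synthetic] " : "") + c);
        }
    }
}
```

In either case the first parameter of each constructor is the enclosing
instance; only the bridge constructor carries the additional marker parameter.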

In effect, the class of this marker argument never needs to be loaded,
because it's never instantiated. But C2 isn't happy about unloaded classes
in signature, so it'd refuse to inline any bridge constructors.

0.320:   17       2 TestC2ArrayListIteratorLoop::sumList
0.321:              @ 3   java.util.ArrayList::iterator (10 bytes)
inlined (hot)
0.321:              - @ 6   java.util.ArrayList$Itr::<init> (6 bytes)
unloaded signature classes
0.321:              @ 8   java.util.ArrayList$Itr::hasNext (20 bytes)
inlined (hot)
0.321:                @ 8   java.util.ArrayList::access$100 (5 bytes)
inlined (hot)
0.322:              @ 25   java.lang.Integer::intValue (5 bytes)   inlined
(hot)

With this enhancement, C2 will be able to ignore the unloaded class in the
bridge constructor, and inline it:

0.269:   18       2 TestC2ArrayListIteratorLoop::sumList
0.269:              @ 3   java.util.ArrayList::iterator (10 bytes)
inlined (hot)
0.270:                @ 6   java.util.ArrayList$Itr::<init> (6 bytes)
inlined (hot)
0.270:                  @ 2   java.util.ArrayList$Itr::<init> (26 bytes)
inlined (hot)
0.270:                  - @ 6   java.lang.Object::<init> (1 bytes)   don't
intrinsify this
0.270:                    @ 6   java.lang.Object::<init> (1 bytes)
inlined (hot)
0.270:              @ 8   java.util.ArrayList$Itr::hasNext (20 bytes)
inlined (hot)
0.270:                @ 8   java.util.ArrayList::access$100 (5 bytes)
inlined (hot)
0.271:              @ 25   java.lang.Integer::intValue (5 bytes)   inlined
(hot)
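
A test along the following lines should produce inlining output of this
shape. This is a sketch only: the logs above give just the name
TestC2ArrayListIteratorLoop::sumList, so the method body, the list size and
the iteration count are assumptions.

```java
import java.util.ArrayList;

class TestC2ArrayListIteratorLoop {
    // The for-each loop compiles to iterator(), hasNext(), next() and
    // Integer.intValue() calls, which are the call sites shown in the logs.
    static int sumList(ArrayList<Integer> list) {
        int sum = 0;
        for (Integer i : list) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            list.add(i);
        }
        long total = 0;
        // Call sumList often enough that it gets hot and is compiled by C2.
        for (int iter = 0; iter < 20_000; iter++) {
            total += sumList(list);
        }
        System.out.println(total);
    }
}
```

Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining prints the
inlining decisions for each compiled method.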

- Kris

On Mon, Sep 12, 2016 at 12:13 PM, Vitaly Davidovich <vitalyd at gmail.com>
wrote:

> Hi all,
>
> Vladimir I. and I have been looking at a peculiarity in EA as it relates
> to eliminating the ArrayList$Itr.  What Vladimir found (and I see it as
> well) is that ArrayList$Itr::init isn't always inlined due to "unloaded
> signature classes", e.g.:
>
> @ 6   java.util.ArrayList::iterator (10 bytes)   inline (hot)
>                               @ 6   java.util.ArrayList$Itr::<init> (6
> bytes)   unloaded signature classes
>
> I tried to dig a bit further into this, and it appears that what's
> "unloaded" is ArrayList$1.  LogCompilation shows this (which I think is
> relevant):
> <bc code='183' bci='6'/>
> <type id='709' name='void'/>
> <klass id='827' name='java/util/ArrayList$1' unloaded='1'/>
> <klass id='821' name='java/util/ArrayList$Itr' flags='2'/>
> <method id='828' holder='821' name='&lt;init&gt;' return='709'
> arguments='820 827' flags='4096' bytes='6' iicount='1853'/>
> <call method='828' count='-1' prof_factor='0.602806' inline='1'/>
> <inline_fail reason='unloaded signature classes'/>
> <direct_call bci='6'/>
> <parse_done nodes='100' live='98' memory='35824' stamp='1.114'/>
> </parse>
>
> It looks like ArrayList$1 is a synthetic class generated by javac because
> ArrayList$Itr constructor is private (despite the class itself being
> private).  Here's the bytecode (8u51) of ArrayList::iterator:
>
> public java.util.Iterator<E> iterator();
>     descriptor: ()Ljava/util/Iterator;
>     flags: ACC_PUBLIC
>     Code:
>       stack=4, locals=1, args_size=1
>          0: new           #61                 // class
> java/util/ArrayList$Itr
>          3: dup
>          4: aload_0
>          5: aconst_null
>          6: invokespecial #62                 // Method
> java/util/ArrayList$Itr."<init>":(Ljava/util/ArrayList;
> Ljava/util/ArrayList$1;)V
>          9: areturn
>       LineNumberTable:
>         line 834: 0
>     Signature: #185                         // ()Ljava/util/Iterator<TE;>;
>
> The only way I can get the Itr allocation removed in my method is by
> causing some other method that does the same thing to be JIT compiled prior
> to mine.
>
> Does anyone have a good idea of what's actually going on here? Why is that
> synthetic ArrayList$1 such a pest here? It's a bit sad that such a little
> thing can prevent EA from working in a perfectly good candidate method for
> it.
>
> Thoughts?
>
> Thanks
>

From vitalyd at gmail.com  Mon Sep 12 19:38:14 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Mon, 12 Sep 2016 15:38:14 -0400
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
Message-ID: <CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>

Hi Kris,

On Mon, Sep 12, 2016 at 3:19 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:

> Hi Vitaly,
>
> Haha. I've actually fixed the exact same problem in Zing JVM when I found
> this out a while ago. Do you guys want the patch to be upstreamed?
>
Incredible - nothing like debugging/troubleshooting an already solved
problem! :( What other goodies could you upstream? :)

Vladimir I. (or K. :)), could you guys accept that patch?

>
> Here's the bug description that I wrote for Zing, but it applies to
> HotSpot as well (since we inherited that bug from HotSpot):
>
> This bug is to track an enhancement that would allow compilation and
> inlining of "bridge constructors" for private inner classes, generated by
> javac.
>
> In HotSpot's compilation policy, and C2's inlining heuristic, if a
> method/constructor is found to have unloaded classes in its signature, then
> there is special handling:
>  * in compilation policy, if a C2 compilation is about to be triggered for
> a method, and there are unloaded classes in its signature, then these
> classes are forced to be loaded before compilation;
>
This explains why I actually had to trigger a dummy method to JIT compile
so that the bridge class would be loaded.  I was slightly puzzled why simply
exercising that iteration code in the interpreter wasn't "loading" the
unloaded class(es).

>  * in C2, when a method is considered to be a candidate for inlining, if
> there are unloaded classes in its signature, it will NOT be inlined.
>

> It's questionable whether or not the C2 inlining heuristic is profitable
> in general, but there's a case where it's definitely not profitable - when
> dealing with "bridge constructors" generated by javac.
>
It seems odd to me as well why inlining won't force load the missing
class(es).  If we're inlining, it means the method itself or the call chain
it's part of is hot - failing to inline can have negative side-effects,
like this example.  I suppose there must be a good reason why it doesn't do
this though?

>
> When javac sees a private inner class with no explicit constructors, e.g.
>
> > package java.util;
> >
> > public class ArrayList<E> implements Iterable<E> {
> >   public Iterator<E> iterator() {
> >     return new Itr();
> >   }
> >
> >   private class Itr implements Iterator<E> { }
> > }
>
> javac will synthesize two constructors for the inner class (e.g. Itr
> above):
> 1. The normal default constructor, with accessibility the same as its
> holder class - private
>   private java.util.ArrayList$Itr(java.util.ArrayList);
> 2. A "bridge constructor". The enclosing class needs to access
> Itr's constructor but doesn't have accessibility to the private one, so
> javac further synthesizes this "bridge constructor" with package
> accessibility, which simply delegates to the private default one:
>   java.util.ArrayList$Itr(java.util.ArrayList, java.util.ArrayList$1);
>
> The sole purpose of the "bridge constructor" is to provide accessibility,
> but if it were only different from the private one in its accessibility,
> the two constructors won't be distinguishable under JVM's overload
> resolution rules. So, javac pulls a trick, and appends a marker argument
> called "access constructor tag" to the argument list of the bridge
> constructor, e.g. java.util.ArrayList$1 in this example, and always passes
> a null to this argument.
>
Aha, so that's why there's that aconst_null right before the invokespecial!
I was wondering what the heck that was.

>
> In effect, the class of this marker argument never needs to be loaded,
> because it's never instantiated. But C2 isn't happy about unloaded classes
> in signature, so it'd refuse to inline any bridge constructors.
>
> 0.320:   17       2 TestC2ArrayListIteratorLoop::sumList
> 0.321:              @ 3   java.util.ArrayList::iterator (10 bytes)
> inlined (hot)
> 0.321:              - @ 6   java.util.ArrayList$Itr::<init> (6 bytes)
> unloaded signature classes
> 0.321:              @ 8   java.util.ArrayList$Itr::hasNext (20 bytes)
> inlined (hot)
> 0.321:                @ 8   java.util.ArrayList::access$100 (5 bytes)
> inlined (hot)
> 0.322:              @ 25   java.lang.Integer::intValue (5 bytes)   inlined
> (hot)
>
> With this enhancement, C2 will be able to ignore the unloaded class in the
> bridge constructor, and inline it:
>
> 0.269:   18       2 TestC2ArrayListIteratorLoop::sumList
> 0.269:              @ 3   java.util.ArrayList::iterator (10 bytes)
> inlined (hot)
> 0.270:                @ 6   java.util.ArrayList$Itr::<init> (6 bytes)
> inlined (hot)
> 0.270:                  @ 2   java.util.ArrayList$Itr::<init> (26 bytes)
>   inlined (hot)
> 0.270:                  - @ 6   java.lang.Object::<init> (1 bytes)   don't
> intrinsify this
> 0.270:                    @ 6   java.lang.Object::<init> (1 bytes)
> inlined (hot)
> 0.270:              @ 8   java.util.ArrayList$Itr::hasNext (20 bytes)
> inlined (hot)
> 0.270:                @ 8   java.util.ArrayList::access$100 (5 bytes)
> inlined (hot)
> 0.271:              @ 25   java.lang.Integer::intValue (5 bytes)   inlined
> (hot)
>
> - Kris
>
Thanks for the great explanation Kris.

>
> On Mon, Sep 12, 2016 at 12:13 PM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>
>> Hi all,
>>
>> Vladimir I. and I have been looking at a peculiarity in EA as it relates
>> to eliminating the ArrayList$Itr.  What Vladimir found (and I see it as
>> well) is that ArrayList$Itr::init isn't always inlined due to "unloaded
>> signature classes", e.g.:
>>
>> @ 6   java.util.ArrayList::iterator (10 bytes)   inline (hot)
>>                               @ 6   java.util.ArrayList$Itr::<init> (6
>> bytes)   unloaded signature classes
>>
>> I tried to dig a bit further into this, and it appears that what's
>> "unloaded" is ArrayList$1.  LogCompilation shows this (which I think is
>> relevant):
>> <bc code='183' bci='6'/>
>> <type id='709' name='void'/>
>> <klass id='827' name='java/util/ArrayList$1' unloaded='1'/>
>> <klass id='821' name='java/util/ArrayList$Itr' flags='2'/>
>> <method id='828' holder='821' name='&lt;init&gt;' return='709'
>> arguments='820 827' flags='4096' bytes='6' iicount='1853'/>
>> <call method='828' count='-1' prof_factor='0.602806' inline='1'/>
>> <inline_fail reason='unloaded signature classes'/>
>> <direct_call bci='6'/>
>> <parse_done nodes='100' live='98' memory='35824' stamp='1.114'/>
>> </parse>
>>
>> It looks like ArrayList$1 is a synthetic class generated by javac because
>> ArrayList$Itr constructor is private (despite the class itself being
>> private).  Here's the bytecode (8u51) of ArrayList::iterator:
>>
>> public java.util.Iterator<E> iterator();
>>     descriptor: ()Ljava/util/Iterator;
>>     flags: ACC_PUBLIC
>>     Code:
>>       stack=4, locals=1, args_size=1
>>          0: new           #61                 // class
>> java/util/ArrayList$Itr
>>          3: dup
>>          4: aload_0
>>          5: aconst_null
>>          6: invokespecial #62                 // Method
>> java/util/ArrayList$Itr."<init>":(Ljava/util/ArrayList;Ljava
>> /util/ArrayList$1;)V
>>          9: areturn
>>       LineNumberTable:
>>         line 834: 0
>>     Signature: #185                         // ()Ljava/util/Iterator<TE;>;
>>
>> The only way I can get the Itr allocation removed in my method is by
>> causing some other method that does the same thing to be JIT compiled prior
>> to mine.
>>
>> Does anyone have a good idea of what's actually going on here? Why is
>> that synthetic ArrayList$1 such a pest here? It's a bit sad that such a
>> little thing can prevent EA from working in a perfectly good candidate
>> method for it.
>>
>> Thoughts?
>>
>> Thanks
>>
>
>

From matcdac at gmail.com  Mon Sep 12 19:53:09 2016
From: matcdac at gmail.com (Prakhar Makhija)
Date: Tue, 13 Sep 2016 01:23:09 +0530
Subject: Incomplete Iterator
Message-ID: <CAPu805rYA493DRU3idYYcrzP9=Lg0R2RW4iA0HjN8XxuYPPNCA@mail.gmail.com>

Hi,

I feel there should be one more Iterator added to the Collections, let's
say UpdatedIterator, which should be implemented for List, Set, Map, etc.

The reason is that the existing one does not support modification of the
Collection during iteration and throws an exception.

It would be great to have a new one besides this, so programmers will have
a choice to pick any of the two.

Let's say more than one thread is accessing its own static collection
field, or some encapsulated data of another object.
Some of these threads will want to just iterate over it, others may
remove something from it, others may add to it, and some may
modify the existing data.
So the basic need is that UpdatedIterator must keep track of the latest
modified version of the Collection and iterate over that.
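
For context, the current behaviour referred to above can be sketched as
follows. The class and method names are hypothetical; the collections used are
the standard java.util ones. The sketch shows the fail-fast exception from
ArrayList, the modifications that ListIterator already allows, and the
snapshot iteration of CopyOnWriteArrayList, which tolerates concurrent changes
but does not show them to a running iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.ListIterator;
import java.util.concurrent.CopyOnWriteArrayList;

class IteratorModificationDemo {
    /** Modifying an ArrayList behind its iterator's back fails fast. */
    static boolean failFastThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            for (Integer i : list) {
                list.add(i + 10); // structural change not made via the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Changes made through the ListIterator itself are permitted. */
    static List<Integer> editThroughListIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            int v = it.next();
            if (v == 1) {
                it.remove();   // drop the element just returned
            } else if (v == 2) {
                it.set(20);    // replace the element just returned
            } else {
                it.add(30);    // insert after the element just returned
            }
        }
        return list;
    }

    /** A CopyOnWriteArrayList iterator reads a snapshot and never throws. */
    static List<Integer> snapshotIteration() {
        List<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> seen = new ArrayList<>();
        for (Integer i : list) {
            list.add(i + 10); // allowed; the running iterator does not see it
            seen.add(i);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(failFastThrows());
        System.out.println(editThroughListIterator());
        System.out.println(snapshotIteration());
    }
}
```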
On Sep 13, 2016 12:51 AM, <hotspot-compiler-dev-request at openjdk.java.net>
wrote:


Today's Topics:

   1. Re: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for
      JVMCI (Christian Thalinger)
   2. Odd interaction between ArrayList$Itr and Escape Analysis
      (Vitaly Davidovich)
   3. Re: Odd interaction between ArrayList$Itr and Escape Analysis
      (Krystal Mok)


----------------------------------------------------------------------

Message: 1
Date: Mon, 12 Sep 2016 07:26:02 -1000
From: Christian Thalinger <cthalinger at twitter.com>
To: Doug Simon <doug.simon at oracle.com>
Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for
        JVMCI
Message-ID: <938FBEA5-0AAB-4640-B231-E259B00275AB at twitter.com>
Content-Type: text/plain; charset="utf-8"


> On Sep 6, 2016, at 11:58 AM, Christian Thalinger <cthalinger at twitter.com>
wrote:
>
>>
>> On Sep 6, 2016, at 11:37 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>
>>
>>> On 06 Sep 2016, at 20:14, Christian Thalinger <cthalinger at twitter.com>
wrote:
>>>
>>>
>>>> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>
>>>> In jvmci-8, we increased the interpreter code size when JVMCI code is
included:
>>>>
>>>> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/
a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37
>>>
>>> What about SPARC?  Have we ever seen a problem there?  Or AArch64 for
that matter?
>>
>> I've only ever seen problems on AMD64. I've never seen it on SPARC and
have never run on AArch64.
>>
>> The real fix is that the interpreter generator should never have to
guess the size of the code buffer it needs but should resize things as
needed after generating the interpreter.
>
> Yes, it should.

Forgot to say that it looks good.


------------------------------

Message: 2
Date: Mon, 12 Sep 2016 15:13:27 -0400
From: Vitaly Davidovich <vitalyd at gmail.com>
To: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
Message-ID:
        <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi all,

Vladimir I. and I have been looking at a peculiarity in EA as it relates to
eliminating the ArrayList$Itr.  What Vladimir found (and I see it as well)
is that ArrayList$Itr::init isn't always inlined due to "unloaded signature
classes", e.g.:

@ 6   java.util.ArrayList::iterator (10 bytes)   inline (hot)
                              @ 6   java.util.ArrayList$Itr::<init> (6
bytes)   unloaded signature classes

I tried to dig a bit further into this, and it appears that what's
"unloaded" is ArrayList$1.  LogCompilation shows this (which I think is
relevant):
<bc code='183' bci='6'/>
<type id='709' name='void'/>
<klass id='827' name='java/util/ArrayList$1' unloaded='1'/>
<klass id='821' name='java/util/ArrayList$Itr' flags='2'/>
<method id='828' holder='821' name='&lt;init&gt;' return='709'
arguments='820 827' flags='4096' bytes='6' iicount='1853'/>
<call method='828' count='-1' prof_factor='0.602806' inline='1'/>
<inline_fail reason='unloaded signature classes'/>
<direct_call bci='6'/>
<parse_done nodes='100' live='98' memory='35824' stamp='1.114'/>
</parse>

It looks like ArrayList$1 is a synthetic class generated by javac because
ArrayList$Itr constructor is private (despite the class itself being
private).  Here's the bytecode (8u51) of ArrayList::iterator:

public java.util.Iterator<E> iterator();
    descriptor: ()Ljava/util/Iterator;
    flags: ACC_PUBLIC
    Code:
      stack=4, locals=1, args_size=1
         0: new           #61                 // class
java/util/ArrayList$Itr
         3: dup
         4: aload_0
         5: aconst_null
         6: invokespecial #62                 // Method
java/util/ArrayList$Itr."<init>":(Ljava/util/ArrayList;
Ljava/util/ArrayList$1;)V
         9: areturn
      LineNumberTable:
        line 834: 0
    Signature: #185                         // ()Ljava/util/Iterator<TE;>;

The only way I can get the Itr allocation removed in my method is by
causing some other method that does the same thing to be JIT compiled prior
to mine.

Does anyone have a good idea of what's actually going on here? Why is that
synthetic ArrayList$1 such a pest here? It's a bit sad that such a little
thing can prevent EA from working in a perfectly good candidate method for
it.

Thoughts?

Thanks

------------------------------

Message: 3
Date: Mon, 12 Sep 2016 12:19:24 -0700
From: Krystal Mok <rednaxelafx at gmail.com>
To: Vitaly Davidovich <vitalyd at gmail.com>
Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: Odd interaction between ArrayList$Itr and Escape Analysis
Message-ID:
        <CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Vitaly,

Haha. I've actually fixed the exact same problem in Zing JVM when I found
this out a while ago. Do you guys want the patch to be upstreamed?

Here's the bug description that I wrote for Zing, but it applies to HotSpot
as well (since we inherited that bug from HotSpot):

This bug is to track an enhancement that would allow compilation and
inlining of "bridge constructors" for private inner classes, generated by
javac.

In HotSpot's compilation policy, and C2's inlining heuristic, if a
method/constructor is found to have unloaded classes in its signature, then
there is special handling:
 * in compilation policy, if a C2 compilation is about to be triggered for a
method, and there are unloaded classes in its signature, then these
classes are forced to be loaded before compilation;
 * in C2, when a method is considered to be a candidate for inlining, if
there are unloaded classes in its signature, it will NOT be inlined.

It's questionable whether or not the C2 inlining heuristic is profitable in
general, but there's a case where it's definitely not profitable - when
dealing with "bridge constructors" generated by javac.

When javac sees a private inner class with no explicit constructors, e.g.

> package java.util;
>
> public class ArrayList<E> implements Iterable<E> {
>   public Iterator<E> iterator() {
>     return new Itr();
>   }
>
>   private class Itr implements Iterator<E> { }
> }

javac will synthesize two constructors for the inner class (e.g. Itr above):
1. The normal default constructor, with accessibility the same as its
holder class - private
  private java.util.ArrayList$Itr(java.util.ArrayList);
2. A "bridge constructor". The enclosing class needs to access
Itr's constructor but cannot access the private one, so
javac further synthesizes this "bridge constructor" with package
accessibility, which simply delegates to the private default one:
  java.util.ArrayList$Itr(java.util.ArrayList, java.util.ArrayList$1);

The sole purpose of the "bridge constructor" is to provide accessibility,
but if it were only different from the private one in its accessibility,
the two constructors won't be distinguishable under JVM's overload
resolution rules. So, javac pulls a trick, and appends a marker argument
called "access constructor tag" to the argument list of the bridge
constructor, e.g. java.util.ArrayList$1 in this example, and always passes
a null to this argument.
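
A minimal sketch of how one might observe this on a small test class, using
reflection. Outer and Inner are hypothetical stand-ins for ArrayList and Itr,
not JDK classes. With a javac that predates nest-based access control (JDK 11)
one would expect a second, synthetic constructor taking an extra Outer$1
parameter to be listed; newer compilers are expected to emit only the private
constructor, so the output depends on the compiler used.

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;

// Hypothetical stand-in for ArrayList (Outer) and its private Itr class (Inner).
final class Outer {
    // Private inner class with no explicit constructor, like ArrayList$Itr.
    private class Inner { }

    // The enclosing class instantiates the private inner class,
    // which is what makes older javac versions emit a bridge constructor.
    Object make() {
        return new Inner();
    }

    // Exposes the declared constructors so they can be inspected from outside.
    static Constructor<?>[] innerConstructors() {
        return Inner.class.getDeclaredConstructors();
    }

    public static void main(String[] args) {
        for (Constructor<?> c : innerConstructors()) {
            System.out.println(c
                + "  synthetic=" + c.isSynthetic()
                + "  params=" + Arrays.toString(c.getParameterTypes()));
        }
    }
}
```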

In effect, the class of this marker argument never needs to be loaded,
because it's never instantiated. But C2 isn't happy about unloaded classes
in signature, so it'd refuse to inline any bridge constructors.

0.320:   17       2 TestC2ArrayListIteratorLoop::sumList
0.321:              @ 3   java.util.ArrayList::iterator (10 bytes)
inlined (hot)
0.321:              - @ 6   java.util.ArrayList$Itr::<init> (6 bytes)
unloaded signature classes
0.321:              @ 8   java.util.ArrayList$Itr::hasNext (20 bytes)
inlined (hot)
0.321:                @ 8   java.util.ArrayList::access$100 (5 bytes)
inlined (hot)
0.322:              @ 25   java.lang.Integer::intValue (5 bytes)   inlined
(hot)

With this enhancement, C2 will be able to ignore the unloaded class in the
bridge constructor, and inline it:

0.269:   18       2 TestC2ArrayListIteratorLoop::sumList
0.269:              @ 3   java.util.ArrayList::iterator (10 bytes)
inlined (hot)
0.270:                @ 6   java.util.ArrayList$Itr::<init> (6 bytes)
inlined (hot)
0.270:                  @ 2   java.util.ArrayList$Itr::<init> (26 bytes)
inlined (hot)
0.270:                  - @ 6   java.lang.Object::<init> (1 bytes)   don't
intrinsify this
0.270:                    @ 6   java.lang.Object::<init> (1 bytes)
inlined (hot)
0.270:              @ 8   java.util.ArrayList$Itr::hasNext (20 bytes)
inlined (hot)
0.270:                @ 8   java.util.ArrayList::access$100 (5 bytes)
inlined (hot)
0.271:              @ 25   java.lang.Integer::intValue (5 bytes)   inlined
(hot)

- Kris

On Mon, Sep 12, 2016 at 12:13 PM, Vitaly Davidovich <vitalyd at gmail.com>
wrote:

> Hi all,
>
> Vladimir I. and I have been looking at a peculiarity in EA as it relates
> to eliminating the ArrayList$Itr.  What Vladimir found (and I see it as
> well) is that ArrayList$Itr::init isn't always inlined due to "unloaded
> signature classes", e.g.:
>
> @ 6   java.util.ArrayList::iterator (10 bytes)   inline (hot)
>                               @ 6   java.util.ArrayList$Itr::<init> (6
> bytes)   unloaded signature classes
>
> I tried to dig a bit further into this, and it appears that what's
> "unloaded" is ArrayList$1.  LogCompilation shows this (which I think is
> relevant):
> <bc code='183' bci='6'/>
> <type id='709' name='void'/>
> <klass id='827' name='java/util/ArrayList$1' unloaded='1'/>
> <klass id='821' name='java/util/ArrayList$Itr' flags='2'/>
> <method id='828' holder='821' name='&lt;init&gt;' return='709'
> arguments='820 827' flags='4096' bytes='6' iicount='1853'/>
> <call method='828' count='-1' prof_factor='0.602806' inline='1'/>
> <inline_fail reason='unloaded signature classes'/>
> <direct_call bci='6'/>
> <parse_done nodes='100' live='98' memory='35824' stamp='1.114'/>
> </parse>
>
> It looks like ArrayList$1 is a synthetic class generated by javac because
> ArrayList$Itr constructor is private (despite the class itself being
> private).  Here's the bytecode (8u51) of ArrayList::iterator:
>
> public java.util.Iterator<E> iterator();
>     descriptor: ()Ljava/util/Iterator;
>     flags: ACC_PUBLIC
>     Code:
>       stack=4, locals=1, args_size=1
>          0: new           #61                 // class
> java/util/ArrayList$Itr
>          3: dup
>          4: aload_0
>          5: aconst_null
>          6: invokespecial #62                 // Method
> java/util/ArrayList$Itr."<init>":(Ljava/util/ArrayList;
> Ljava/util/ArrayList$1;)V
>          9: areturn
>       LineNumberTable:
>         line 834: 0
>     Signature: #185                         // ()Ljava/util/Iterator<TE;>;
>
> The only way I can get the Itr allocation removed in my method is by
> causing some other method that does the same thing to be JIT compiled prior
> to mine.
>
> Does anyone have a good idea of what's actually going on here? Why is that
> synthetic ArrayList$1 such a pest here? It's a bit sad that such a little
> thing can prevent EA from working in a perfectly good candidate method for
> it.
>
> Thoughts?
>
> Thanks
>

End of hotspot-compiler-dev Digest, Vol 112, Issue 20
*****************************************************

From rednaxelafx at gmail.com  Mon Sep 12 19:56:52 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Mon, 12 Sep 2016 12:56:52 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
Message-ID: <CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>

On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich <vitalyd at gmail.com>
wrote:
>
> It seems odd to me as well why inlining won't force load the missing
> class(es).  If we're inlining, it means the method itself or the call chain
> it's part of is hot - failing to inline can have negative side-effects,
> like this example.  I suppose there must be a good reason why it doesn't do
> this though?
>

That's because we can't. The JIT compilers are running on their own
threads, and they're not real "Java threads". So they are not allowed to
run arbitrary Java code. But Java class loading may involve running
arbitrary Java code, e.g. the ClassLoader.loadClass() upcall.
Force class loading can be done on the triggering side (for the top-level
method), because compilation tasks are triggered from real Java threads,
and they're allowed to run arbitrary Java code.
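
As an illustration of loading "from a real Java thread", here is a small
sketch that loads a class by name from application code without initializing
it. The helper name tryLoad is made up for this example. Whether preloading
the tag class this way actually changes C2's inlining decision is an untested
assumption, and java.util.ArrayList$1 may not exist at all on other JDK
builds, which is why the lookup is guarded.

```java
// Sketch: load a class by name from an ordinary Java thread, where running
// class-loading code is allowed. The helper name is hypothetical.
final class Preload {
    // Returns true if the class could be loaded, false if it does not exist
    // or cannot be linked.
    static boolean tryLoad(String className) {
        try {
            // initialize=false: load the class but do not run its static initializer
            Class.forName(className, false, Preload.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The tag class named earlier in this thread; absent on some JDK builds.
        System.out.println(tryLoad("java.util.ArrayList$1"));
    }
}
```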

- Kris

From vitalyd at gmail.com  Mon Sep 12 20:15:41 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Mon, 12 Sep 2016 16:15:41 -0400
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
Message-ID: <CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>

On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:

> On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich <vitalyd at gmail.com>
> wrote:
>>
>> It seems odd to me as well why inlining won't force load the missing
>> class(es).  If we're inlining, it means the method itself or the call chain
>> it's part of is hot - failing to inline can have negative side-effects,
>> like this example.  I suppose there must be a good reason why it doesn't do
>> this though?
>>
>
> That's because we can't. The JIT compilers are running on their own
> threads, and they're not real "Java threads". So they are not allowed to
> run arbitrary Java code. But Java class loading may involve running
> arbitrary Java code, e.g. the ClassLoader.loadClass() upcall.
> Force class loading can be done on the triggering side (for the top-level
> method), because compilation tasks are triggered from real Java threads,
> and they're allowed to run arbitrary Java code.
>
I see, makes sense.  Perhaps there can be an option to turn on loading of
required types in the entire compilation unit, after all inlining is done
(and therefore make the unloaded types not be barriers for inlining).  I'd
personally prefer that over having odd performance differences.

>
> - Kris
>
Thanks Kris.

From cheremin at gmail.com  Tue Sep 13 09:07:48 2016
From: cheremin at gmail.com (Cheremin Ruslan)
Date: Tue, 13 Sep 2016 12:07:48 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
Message-ID: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>

> I'm seeing some code that iterates over a ConcurrentHashMap's entrySet that allocates tens of GB of CHM$MapEntry objects even though they don't escape


I'm a bit confused: I was sure BCEA-style params do affect EA, but don't affect scalar replacement. With bcEscapeAnalyser you can get (sort of) inter-procedural EA, but this only allows you to have more allocations identified as ArgEscape instead of GlobalEscape. But you can't get more NoEscape without real inlining. ArgEscape (afaik) is used only for synchronization removals in HotSpot, not for scalar replacements. 

Am I incorrect?
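
To make the three escape states concrete, here is a small sketch with one
method per state. The class and method names are invented for illustration,
and the classification in the comments is what one would typically expect,
not something read from a compiler log: the real result depends on what
actually gets inlined.

```java
// Sketch of the three escape states discussed above; all names are hypothetical.
final class EscapeStates {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // A static field is reachable from any thread.
    static Point leaked;

    // Expected NoEscape once the constructor is inlined: p is never stored
    // or passed on, so it is a candidate for scalar replacement.
    static int noEscape(int a, int b) {
        Point p = new Point(a, b);
        return p.x + p.y;
    }

    // Expected ArgEscape if sum() is NOT inlined: p is handed to a callee
    // that only reads it. If sum() is inlined, this becomes like noEscape().
    static int argEscape(int a, int b) {
        Point p = new Point(a, b);
        return sum(p);
    }

    static int sum(Point p) {
        return p.x + p.y;
    }

    // Expected GlobalEscape: p is published through a static field.
    static int globalEscape(int a, int b) {
        Point p = new Point(a, b);
        leaked = p;
        return p.x + p.y;
    }

    public static void main(String[] args) {
        System.out.println(noEscape(1, 2) + argEscape(3, 4) + globalEscape(5, 6));
    }
}
```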

----
Ruslan

From martin.doerr at sap.com  Tue Sep 13 09:35:09 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 13 Sep 2016 09:35:09 +0000
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
In-Reply-To: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
References: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
Message-ID: <fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>

Hi Hiroshi,

we appreciate your change. Thanks for contributing it.
It basically looks good, but I'd like to propose some minor improvements.


kernel_crc32_1word_vpmsumd:

1. The Pre-align code can be implemented shorter:
    clrldi_(prealign, buf, 57);
    beq(CCR0, L_alignHead);

    subfic(prealign, prealign, 128);

2. I'd prefer the label name "L_alignedHead".

3. The branch b(L_alignTail) and the label are not needed and should get removed.
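
For readers less familiar with the PPC mnemonics, the arithmetic behind the
shorter pre-align sequence in item 1 can be sketched in Java. This is my
reading of the three instructions, not code from the webrev: clearing the
upper 57 bits keeps the low 7 bits of the address (its offset within a
128-byte block), a zero result means the buffer is already aligned, and
otherwise the value is subtracted from 128. The class and method names are
made up.

```java
// Sketch of the pre-align arithmetic; names are hypothetical.
final class PreAlign {
    // Number of bytes to process before reaching the next 128-byte boundary;
    // 0 if the address is already aligned.
    static long bytesToAlign(long address) {
        long low7 = address & 0x7F;          // what clearing the upper 57 bits leaves
        return low7 == 0 ? 0 : 128 - low7;   // the subtract-from-128 step
    }

    public static void main(String[] args) {
        for (long addr : new long[] {0, 1, 127, 128, 130, 255}) {
            System.out.println(addr + " -> " + bytesToAlign(addr));
        }
    }
}
```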


kernel_crc32_1word_aligned:

1. When saving and restoring non-volatile vector registers, please use offset differences of -16 instead of -32.
(The ABI allows up to 288 bytes to be used in frameless functions so it will fit if -16 is used.)

2. The std instructions should better be used with int offsets so you can get rid of the addi(offset, offset, -8) instructions.


Comments:
For single line comments "//" should be used instead of "/*". Would be nice if you could change them.


Thanks and best regards,
Martin


From: Hiroshi H Horii [mailto:HORII at jp.ibm.com]
Sent: Tuesday, 6 September 2016 16:50
To: hotspot-compiler-dev at openjdk.java.net; vladimir.kozlov at oracle.com
Cc: Volker Simonis (volker.simonis at gmail.com) <volker.simonis at gmail.com>; Doerr, Martin <martin.doerr at sap.com>; Gustavo Bueno Romero <gromero at br.ibm.com>
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic

Dear Vladimir and all:

Can I please request reviews for the following change?

JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
webrev: http://cr.openjdk.java.net/~gromero/8164920/01/

As Volker commented in the above JIRA, this is a ppc64-only improvement which will not
affect any of the Oracle platforms in any way.

This change includes a new implementation of the CRC32 intrinsics for ppc64le.
In my local experiment, CRC32 of 64KB was calculated more than 20 times faster than the original.
Performance of the CRC32 intrinsic is important for running recent Apache Cassandra.
A Cassandra daemon needs to read 64KB data from a disk with CRC32 checksum by default.
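
For context, the Java-level operation that such an intrinsic accelerates is
java.util.zip.CRC32. The following is only a sketch of that call on a 64KB
block; the class name and the fill pattern are arbitrary, and it makes no
claim about how Cassandra itself invokes the checksum.

```java
import java.util.zip.CRC32;

// Sketch: checksum a 64KB block with java.util.zip.CRC32.
final class Crc32Demo {
    // Computes the CRC32 of the whole array.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] block = new byte[64 * 1024];
        for (int i = 0; i < block.length; i++) {
            block[i] = (byte) i;             // arbitrary test pattern
        }
        System.out.println(Long.toHexString(crc32(block)));
    }
}
```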

This JIRA entry has "jdk9-fc-request" label.
If there is a chance to include new change in JDK 9 for ppc64le, I would like to request
a review for this change.

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo

From forax at univ-mlv.fr  Tue Sep 13 14:18:29 2016
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 13 Sep 2016 16:18:29 +0200 (CEST)
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
Message-ID: <1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>

I've always found the empty inner classes generated by javac to be a kind of hack.

These classes should be removed in Java 10, thanks to the nestmate attributes. 
http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html 

The other solution is to have an empty class in the JDK which is not visible from javac (the class itself can be marked as synthetic),
so javac can use it without creating a method clash.

And to solve the problem now, the easy solution is to add a package-private constructor in ArrayList.Itr:

private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    Itr() {
        // avoid generating a synthetic accessor constructor
    }
}

regards, 
Rémi

> From: "Vitaly Davidovich" <vitalyd at gmail.com>
> To: "Krystal Mok" <rednaxelafx at gmail.com>
> Cc: "hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
> Sent: Monday, 12 September 2016 22:15:41
> Subject: Re: Odd interaction between ArrayList$Itr and Escape Analysis

> On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok < rednaxelafx at gmail.com > wrote:

>> On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich < vitalyd at gmail.com > wrote:
>>> It seems odd to me as well why inlining won't force load the missing class(es).
>>> If we're inlining, it means the method itself or the call chain it's part of is
>>> hot - failing to inline can have negative side-effects, like this example. I
>>> suppose there must be a good reason why it doesn't do this though?

>> That's because we can't. The JIT compilers are running on their own threads, and
>> they're not real "Java threads". So they are not allowed to run arbitrary Java
>> code. But Java class loading may involve running arbitrary Java code, e.g. the
>> ClassLoader.loadClass() upcall.
>> Force class loading can be done on the triggering side (for the top-level
>> method), because compilation tasks are triggered from real Java threads, and
>> they're allowed to run arbitrary Java code.

> I see, makes sense. Perhaps there can be an option to turn on loading of
> required types in the entire compilation unit, after all inlining is done (and
> therefore make the unloaded types not be barriers for inlining). I'd personally
> prefer that over having odd performance differences.

>> - Kris

> Thanks Kris.

From zoltan.majo at oracle.com  Tue Sep 13 15:04:47 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Tue, 13 Sep 2016 17:04:47 +0200
Subject: RFR(S): 8159611: C2: ArrayCopy elimination skips required
	parameter checks
In-Reply-To: <CA+3eh11qwNSwAdtxq1QyqRFcHpLsLSnKFSRH_5Vx8Zi8u5Ci_A@mail.gmail.com>
References: <CA+3eh10fWqYQbL-eKEk=ud26pR5C1hM6Z_S0=ROJe-TnipKcNQ@mail.gmail.com>
	<57B2A380.6000408@oracle.com>
	<CA+3eh10yxdunPPYRptfTghA6LKRc5iO+VU=EsFr2PJgQmexQag@mail.gmail.com>
	<41851a79-5ffe-2b9d-504a-6a2301de5384@oracle.com>
	<CA+3eh12Jmcnt83YZ3cWaX1+JdSvY2pTNaLC=uQSLmK-6oGMZHA@mail.gmail.com>
	<7ce01d28-13f5-098a-9898-080f8258881d@oracle.com>
	<CA+3eh11qwNSwAdtxq1QyqRFcHpLsLSnKFSRH_5Vx8Zi8u5Ci_A@mail.gmail.com>
Message-ID: <8e399624-8e67-ebe6-d348-7691690532e8@oracle.com>

Hi Volker,


On 09/12/2016 06:35 PM, Volker Simonis wrote:
> Sorry for the long delay...

thank you for spending more time on this bug and also for the detailed 
description of the way your solution works!

>
> Here's my new version:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v3/

That looks good to me.

I did a preliminary performance evaluation with Octane-Gbemu and 
Octane-PdfJS, results look good on all platforms.  Let me now do a more 
detailed evaluation.  I'll get back to you once the results are available.

Thank you!

Best regards,


Zoltan

>
> I've actually changed PhaseMacroExpand::expand_arraycopy_node() such
> that it calls generate_arraycopy() with 'length_never_negative' set to
> true if EliminateAllocations is true (in this case we already checked
> in LibraryCallKit::inline_arraycopy() that 'length' is not negative).
> This way I could leave generate_arraycopy() untouched.
>
> The generated code now looks as follows:
>
> Original version (without 'length < 0' check):
>
> 0a7   B5: #    B17 B6 <- B4  Freq: 0,999998
> 0a7       cmpl    R9, R11    # unsigned
> 0aa       jb,u  B17  P=0,000001 C=-1,000000
> ...
> 0da   B7: #    B18 B8 <- B6 B12 B13  Freq: 0,999997
> 0da       movl    R11, [rsp + #8]    # spill
> 0df       testl   R11, R11
> 0e2       jle     B18  P=0,000001 C=-1,000000
> ...
> 0e8   B8: #    B9 <- B7  Freq: 0,999996
> 0f9       call_leaf_nofp,runtime  oop_disjoint_arraycopy
> ...
> 106   B9: #    B10 <- B8 B18 B20  Freq: 0,999997
> 113       ret
> ...
> 184   B17: #    N1 <- B4 B5  Freq: 2,01328e-06
> 193       call,static  wrapper for:
> uncommon_trap(reason='intrinsic_or_type_checked_inlining'
> action='make_not_entrant' debug_id='0')
>
> 19d   B18: #    B9 B19 <- B7  Freq: 9,99997e-07
> 19d       testl   R11, R11
> 1a0       jge     B9  P=0,999999 C=-1,000000
> 1a0
> 1a6   B19: #    B22 B20 <- B18  Freq: 9,99997e-13
> 1a6       movq    RSI, R8    # spill
> 1a9       movl    RDX, #1    # int
> 1ae       movq    RCX, R10    # spill
> 1b1       movl    R8, #1    # int
> 1b7       movl    R9, R11    # spill
>            nop     # 1 bytes pad for loops and calls
> 1bb       call,static  wrapper for: slow_arraycopy
>
> In B5 there's a check if 'offset+length' is still in the array range.
> If not we jump to the uncommon trap in B17.
> In B7 there's the first check from
> PhaseMacroExpand::generate_arraycopy() (i.e.
> generate_nonpositive_guard()). If 'length' is less than or equal to
> zero we jump to B18 where there's the second check from
> PhaseMacroExpand::generate_arraycopy() (i.e.
> generate_negative_guard()). If 'length' is zero, we jump to B9 and
> return. Otherwise we fall into B19 from where we call slow_arraycopy.
> slow_arraycopy (which is generated in ObjArrayKlass::copy_array()) will
> throw an AIOOB exception if 'length' is negative.
>
> The new version now looks as follows:
>
> 0a2   B5: #    B19 B6 <- B4  Freq: 0,999998
> 0a2       cmpl    R10, RCX    # unsigned
> 0a5       jb,u  B19  P=0,000001 C=-1,000000
> 0a5
> 0ab   B6: #    B20 B7 <- B5  Freq: 0,999997
> 0ab       movl    R10, [rsp + #0]    # spill
> 0af       testl   R10, R10
> 0b2       jl     B20  P=0,000001 C=-1,000000
> 0b2
> ...
> 0e2   B8: #    B10 B9 <- B7 B13 B14  Freq: 0,999996
> 0e2       testl   R10, R10
> 0e5       je,s   B10  P=0,000001 C=-1,000000
> ...
> 0e7   B9: #    B10 <- B8  Freq: 0,999995
> 0f8       call_leaf_nofp,runtime  oop_disjoint_arraycopy
> ...
> 105   B10: #    B11 <- B9 B8  Freq: 0,999996
> 112       ret
> ...
> 18e   B19: #    B20 <- B5  Freq: 9,99998e-07
> 192   B20: #    N1 <- B18 B19 B6  Freq: 3,01327e-06
> 1a3       call,static  wrapper for:
> uncommon_trap(reason='intrinsic_or_type_checked_inlining'
> action='make_not_entrant' debug_id='0')
>
> B5 is like before, but is now followed by the extra check for 'length'
> being not negative in B6. In B8 we now have the first check (i.e.
> generate_negative_guard()) from
> PhaseMacroExpand::generate_arraycopy(). It directly checks if 'length'
> is zero and jumps to B10 (i.e. returns) if so. Otherwise we fall
> directly into oop_disjoint_arraycopy(). There's no need to check for
> 'length' being negative and calling 'slow_arraycopy' because this case
> is already handled before now (in B6).
>
> Is this OK now?
>
> Thank you and best regards,
> Volker
>
>
> On Fri, Aug 26, 2016 at 3:51 AM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>> Looks good.
>>
>> Check does not fold because it is different: LT vs LE.
>>
>> Actually there are 3 checks together with yours (see
>> PhaseMacroExpand::generate_arraycopy()):
>>
>>    Node* not_pos = generate_nonpositive_guard(ctrl, copy_length,
>> length_never_negative);
>>    if (not_pos != NULL) {
>>      Node* local_ctrl = not_pos, *local_io = *io;
>>      MergeMemNode* local_mem = MergeMemNode::make(mem);
>>      transform_later(local_mem);
>>
>>      // (6) length must not be negative.
>>      if (!length_never_negative) {
>>        generate_negative_guard(&local_ctrl, copy_length, slow_region);
>>      }
>>
>> I think the only way to avoid this is to modify code in generate_arraycopy()
>> when EliminateAllocations is true. In such case you need to generate only
>> length == 0 check.
>>
>> Thanks,
>> Vladimir
>>
>>
>> On 8/25/16 10:03 AM, Volker Simonis wrote:
>>> On Tue, Aug 16, 2016 at 11:49 PM, Vladimir Kozlov
>>> <vladimir.kozlov at oracle.com> wrote:
>>>> Not generating an exception is definitely a bug.
>>>>
>>>> First, about test case. It would be nice if it also verifies other
>>>> IndexOutOfBoundsException cases.
>>>>
>>> I've extended the test case. See:
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v2/
>>>
>>> With the new test I've caught another problem in C1 (only on x86 and
>>> s390, but that's not in the OpenJDK yet :).
>>>
>>> LIR_Assembler::emit_arraycopy() had a shortcut for length==0 which
>>> prevented the throwing of an ArrayStoreException if src and dst arrays
>>> have incompatible type (see do_test2() in the new regression test).
>>> Note that this is a different error from 8160591 and not fixed by the
>>> change for 8160591.
>>>
>>> I've also moved the new check after the offset + length check as
>>> suggested by you (see new webrev).
>>>
>>> Unfortunately, the new check is still not eliminated. Here's how it looks:
>>>
>>> 0ae   B6: #    B20 B7 <- B5  Freq: 0,999997
>>> 0ae       movl    R9, [rsp + #0]    # spill
>>> 0b2       testl   R9, R9
>>> 0b5       jl     B20  P=0,000001 C=-1,000000
>>> 0b5
>>> 0bb   B7: #    B12 B8 <- B6  Freq: 0,999996
>>> 0bb       movl    R11, [R10 + #8 (8-bit)]    # compressed klass ptr
>>> 0bf       decode_klass_not_null RAX,R11
>>> 0cc       movl    RBX, [RAX + #16 (8-bit)]    # int
>>> 0cf       movslq  RCX, RBX    # i2l
>>> 0d2       movq    RSI, precise klass [Ljava/lang/Object;:
>>> 0x00007ff1080320d0:Constant:exact *    # ptr
>>> 0dc       movq    RCX, [RSI + RCX]    # class
>>> 0e0       cmpq    RAX, RCX    # ptr
>>> 0e3       jne,us  B12  P=0,170000 C=-1,000000
>>> 0e3
>>> 0e5   B8: #    B21 B9 <- B7 B13 B14  Freq: 0,999996
>>> 0e5       testl   R9, R9
>>> 0e8       jle     B21  P=0,000001 C=-1,000000
>>>
>>> As you can see 'testl   R9, R9' is executed two times.
>>>
>>> I've even tried to move the new check after the subtype check, but
>>> that doesn't help either:
>>>
>>> 0da   B7: #    B20 B8 <- B6 B13 B14  Freq: 0,999997
>>> 0da       movl    R11, [rsp + #8]    # spill
>>> 0df       testl   R11, R11
>>> 0e2       jl     B20  P=0,000001 C=-1,000000
>>> 0e2
>>> 0e8   B8: #    B10 B9 <- B7  Freq: 0,999996
>>> 0e8       testl   R11, R11
>>> 0eb       jle,s   B10  P=0,000001 C=-1,000000
>>>
>>> Any idea how this could be fixed?
>>>
>>> Thanks,
>>> Volker
>>>
>>> PS: and I still don't have a reproducible benchmark which shows a
>>> regression with my change...
>>>
>>>
>>>> Actually the additional dynamic check will help in case a negative length is
>>>> known during compilation. The allocation code will be eliminated very
>>>> early
>>>> instead of waiting macro expansion:
>>>>
>>>>        int length = alloc->in(AllocateNode::ALength)->find_int_con(-1);
>>>>        if (length < 0) {
>>>>          NOT_PRODUCT(fail_eliminate = "Array's size is not constant";)
>>>>          can_eliminate = false;
>>>>        }
>>>>
>>>> About additional length check in your new test. I think it may be
>>>> collapsed
>>>> with preceding check since it is generated after other checks.
>>>> So I would suggest to move it after offset + length check.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>
>>>> On 8/16/16 7:57 AM, Volker Simonis wrote:
>>>>>
>>>>> On Tue, Aug 16, 2016 at 7:24 AM, Tobias Hartmann
>>>>> <tobias.hartmann at oracle.com> wrote:
>>>>>>
>>>>>> Hi Volker,
>>>>>>
>>>>>> thanks for taking care of this issue!
>>>>>>
>>>>>> Did you check what happens if the allocation is not eliminated and
>>>>>> macro
>>>>>> expansion phase emits another negative guard? Are the checks merged?
>>>>>>
>>>>> It depends. I just saw that in some cases the regression test worked
>>>>> before, because the length check was done in
>>>>> SharedRuntime::slow_arraycopy_C(). So in that case there's obviously
>>>>> nothing that can be merged. But the test case is obviously a
>>>>> degenerated example anyway, so I don't think that's a problem.
>>>>>
>>>>> If I do a more real-world example like this where the arraycopy cannot
>>>>> be eliminated because one of its arguments escapes:
>>>>>
>>>>>    public static boolean do_test2(int length, Object[] dest) {
>>>>>      try {
>>>>>        System.arraycopy(new Object[10], 1, dest, 1, length);
>>>>>        return false;
>>>>>      } catch (IndexOutOfBoundsException e) {
>>>>>        return true;
>>>>>      }
>>>>>    }
>>>>>
>>>>> and call it with:
>>>>>
>>>>> do_test2(8, new Object[10])
>>>>>
>>>>> the generated code for do_test2() unfortunately contains one more
>>>>> check now with my change (the 'length' field is in [rsp + #0]):
>>>>>
>>>>> 0a2   B4: #    B18 B5 <- B3  Freq: 0,999999
>>>>> 0a2       movl    R9, [rsp + #0]    # spill
>>>>> 0a6       testl   R9, R9
>>>>> 0a9       jl     B18  P=0,000001 C=-1,000000
>>>>> 0a9
>>>>> 0af   B5: #    B18 B6 <- B4  Freq: 0,999998
>>>>> 0af       movl    RBX, R9    # spill
>>>>> 0b2       incl    RBX    # int
>>>>> 0b4       cmpl    RBX, #10    # unsigned
>>>>> 0b7       jnbe,u  B18  P=0,000001 C=-1,000000
>>>>>
>>>>> The generated code before my change looked like this (again the
>>>>> 'length' field is in [rsp + #0]):
>>>>>
>>>>> 0a1   B4: #    B17 B5 <- B3  Freq: 0,999999
>>>>> 0a1       movl    R11, [rsp + #8]    # spill
>>>>> 0a6       incl    R11    # int
>>>>> 0a9       cmpl    R11, #10    # unsigned
>>>>> 0ad       jnbe,u  B17  P=0,000001 C=-1,000000
>>>>>
>>>>> It seems that the 'length' check has been completely eliminated before.
>>>>>
>>>>> So I need to do some more tests to understand why the new check isn't
>>>>> eliminated.
>>>>>
>>>>> Do you think the new check results in a performance regression? Have
>>>>> you run some benchmarks?
>>>>>
>>>>>> I would prefer brackets around the if body but you don't need to send
>>>>>> another webrev:
>>>>>>   if (EliminateAllocations) {
>>>>>>     generate_negative_guard(length, slow_region);
>>>>>>   }
>>>>>
>>>>>
>>>>> Yes, I agree.
>>>>>
>>>>>> Best regards,
>>>>>> Tobias
>>>>>>
>>>>>> On 12.08.2016 21:13, Volker Simonis wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> can I please have a review and sponsor for the following fix:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8159611
>>>>>>>
>>>>>>>
>>>>>>> We are inserting several checks for the arguments of
>>>>>>> System.arraycopy() in LibraryCallKit::inline_arraycopy() before
>>>>>>> intrinsifying the call in LibraryCallKit::inline_arraycopy(). However the
>>>>>>> check for the 'length' argument of arraycopy is postponed to the macro
>>>>>>> expansion phase in PhaseMacroExpand::generate_arraycopy().
>>>>>>>
>>>>>>> But if we are running with EscapeAnalysis and EliminateAllocations,
>>>>>>> the array allocations inside a call to System.arraycopy() may get
>>>>>>> eliminated and thus the complete call to System.arraycopy() will be
>>>>>>> removed (see PhaseMacroExpand::process_users_of_allocation). In this
>>>>>>> case the extra 'length' check won't be added by
>>>>>>> PhaseMacroExpand::generate_arraycopy() any more because macro
>>>>>>> expansion happens after the elimination of macro nodes.
>>>>>>>
>>>>>>> In such a case it may happen that System.arraycopy() will silently
>>>>>>> accept an invalid (i.e. negative) 'length' parameter, although it
>>>>>>> should actually throw an ArrayIndexOutOfBoundsException.
>>>>>>>
>>>>>>> The fix is simple: also insert a check for the length field in
>>>>>>> LibraryCallKit::inline_arraycopy() if we are running with
>>>>>>> EliminateAllocations.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Volker
>>>>>>>


From vladimir.kozlov at oracle.com  Tue Sep 13 16:32:46 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Sep 2016 09:32:46 -0700
Subject: RFR(S): 8159611: C2: ArrayCopy elimination skips required
	parameter checks
In-Reply-To: <CA+3eh11qwNSwAdtxq1QyqRFcHpLsLSnKFSRH_5Vx8Zi8u5Ci_A@mail.gmail.com>
References: <CA+3eh10fWqYQbL-eKEk=ud26pR5C1hM6Z_S0=ROJe-TnipKcNQ@mail.gmail.com>
	<57B2A380.6000408@oracle.com>
	<CA+3eh10yxdunPPYRptfTghA6LKRc5iO+VU=EsFr2PJgQmexQag@mail.gmail.com>
	<41851a79-5ffe-2b9d-504a-6a2301de5384@oracle.com>
	<CA+3eh12Jmcnt83YZ3cWaX1+JdSvY2pTNaLC=uQSLmK-6oGMZHA@mail.gmail.com>
	<7ce01d28-13f5-098a-9898-080f8258881d@oracle.com>
	<CA+3eh11qwNSwAdtxq1QyqRFcHpLsLSnKFSRH_5Vx8Zi8u5Ci_A@mail.gmail.com>
Message-ID: <57D82A2E.7020902@oracle.com>

Yes, I agree with generate_negative_guard() in inline_arraycopy().

But I think we should pass a flag to ArrayCopyNode::make() when the negative guard is generated in inline_arraycopy().
It is generated under several conditions so I don't want it to be missed in expand_arraycopy_node().
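
To make the idea of carrying a flag from the inlining stage to the expansion stage concrete, here is a small Java model of the guard logic discussed in this thread. It is only a sketch: the class name, the Outcome labels and both method names are hypothetical, and it is not HotSpot code. It assumes that the early guard traps on a negative length and that the later expansion may then skip its own negative check.

```java
// Hypothetical model of the guard ordering; not HotSpot code.
final class GuardOrderSketch {
    // Labels invented for this sketch.
    enum Outcome { UNCOMMON_TRAP, NOTHING_TO_COPY, FAST_COPY, SLOW_PATH }

    // Models the expansion stage. When lengthNeverNegative is true,
    // only the 'length == 0' shortcut is needed.
    static Outcome expand(int length, boolean lengthNeverNegative) {
        if (lengthNeverNegative) {
            return length == 0 ? Outcome.NOTHING_TO_COPY : Outcome.FAST_COPY;
        }
        if (length <= 0) {                 // nonpositive guard
            if (length < 0) {              // negative guard
                return Outcome.SLOW_PATH;  // the slow path throws the exception
            }
            return Outcome.NOTHING_TO_COPY;
        }
        return Outcome.FAST_COPY;
    }

    // Models the inlining stage followed by expansion. The boolean plays the
    // role of the flag: it records whether the early negative guard was emitted.
    static Outcome inlineThenExpand(int length, boolean earlyGuardEmitted) {
        if (earlyGuardEmitted && length < 0) {
            return Outcome.UNCOMMON_TRAP;
        }
        return expand(length, earlyGuardEmitted);
    }

    public static void main(String[] args) {
        System.out.println(inlineThenExpand(-1, true));
        System.out.println(inlineThenExpand(-1, false));
        System.out.println(inlineThenExpand(0, true));
        System.out.println(inlineThenExpand(5, true));
    }
}
```

In this model, passing the flag explicitly means the expansion stage does not have to re-derive the conditions under which the early guard was emitted.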

Thanks,
Vladimir

On 9/12/16 9:35 AM, Volker Simonis wrote:
> Sorry for the long delay...
>
> Here's my new version:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v3/
>
> I've actually changed PhaseMacroExpand::expand_arraycopy_node() such
> that it calls generate_arraycopy() with 'length_never_negative' set to
> true if EliminateAllocations is true (in this case we already checked
> in LibraryCallKit::inline_arraycopy() that 'length' is not negative).
> This way I could leave generate_arraycopy() untouched.
>
> The generated code now looks as follows:
>
> Original version (without 'length < 0' check):
>
> 0a7   B5: #    B17 B6 <- B4  Freq: 0,999998
> 0a7       cmpl    R9, R11    # unsigned
> 0aa       jb,u  B17  P=0,000001 C=-1,000000
> ...
> 0da   B7: #    B18 B8 <- B6 B12 B13  Freq: 0,999997
> 0da       movl    R11, [rsp + #8]    # spill
> 0df       testl   R11, R11
> 0e2       jle     B18  P=0,000001 C=-1,000000
> ...
> 0e8   B8: #    B9 <- B7  Freq: 0,999996
> 0f9       call_leaf_nofp,runtime  oop_disjoint_arraycopy
> ...
> 106   B9: #    B10 <- B8 B18 B20  Freq: 0,999997
> 113       ret
> ...
> 184   B17: #    N1 <- B4 B5  Freq: 2,01328e-06
> 193       call,static  wrapper for:
> uncommon_trap(reason='intrinsic_or_type_checked_inlining'
> action='make_not_entrant' debug_id='0')
>
> 19d   B18: #    B9 B19 <- B7  Freq: 9,99997e-07
> 19d       testl   R11, R11
> 1a0       jge     B9  P=0,999999 C=-1,000000
> 1a0
> 1a6   B19: #    B22 B20 <- B18  Freq: 9,99997e-13
> 1a6       movq    RSI, R8    # spill
> 1a9       movl    RDX, #1    # int
> 1ae       movq    RCX, R10    # spill
> 1b1       movl    R8, #1    # int
> 1b7       movl    R9, R11    # spill
>            nop     # 1 bytes pad for loops and calls
> 1bb       call,static  wrapper for: slow_arraycopy
>
> In B5 there's a check if 'offset+length' is still in the array range.
> If not we jump to the uncommon trap in B17.
> In B7 there's the first check from
> PhaseMacroExpand::generate_arraycopy() (i.e.
> generate_nonpositive_guard()). If 'length' is less than or equal to
> zero we jump to B18 where there's the second check from
> PhaseMacroExpand::generate_arraycopy() (i.e.
> generate_negative_guard()). If 'length' is zero, we jump to B9 and
> return. Otherwise we fall into B19 from where we call slow_arraycopy.
> slow_arraycopy (which is generated in ObjArrayKlass::copy_array()) will
> throw an AIOOB exception if 'length' is negative.
>
> The new version now looks as follows:
>
> 0a2   B5: #    B19 B6 <- B4  Freq: 0,999998
> 0a2       cmpl    R10, RCX    # unsigned
> 0a5       jb,u  B19  P=0,000001 C=-1,000000
> 0a5
> 0ab   B6: #    B20 B7 <- B5  Freq: 0,999997
> 0ab       movl    R10, [rsp + #0]    # spill
> 0af       testl   R10, R10
> 0b2       jl     B20  P=0,000001 C=-1,000000
> 0b2
> ...
> 0e2   B8: #    B10 B9 <- B7 B13 B14  Freq: 0,999996
> 0e2       testl   R10, R10
> 0e5       je,s   B10  P=0,000001 C=-1,000000
> ...
> 0e7   B9: #    B10 <- B8  Freq: 0,999995
> 0f8       call_leaf_nofp,runtime  oop_disjoint_arraycopy
> ...
> 105   B10: #    B11 <- B9 B8  Freq: 0,999996
> 112       ret
> ...
> 18e   B19: #    B20 <- B5  Freq: 9,99998e-07
> 192   B20: #    N1 <- B18 B19 B6  Freq: 3,01327e-06
> 1a3       call,static  wrapper for:
> uncommon_trap(reason='intrinsic_or_type_checked_inlining'
> action='make_not_entrant' debug_id='0')
>
> B5 is like before, but is now followed by the extra check for 'length'
> being not negative in B6. In B8 we now have the first check (i.e.
> generate_nonpositive_guard()) from
> PhaseMacroExpand::generate_arraycopy(). It directly checks if 'length'
> is zero and jumps to B10 (i.e. returns) if so. Otherwise we fall
> directly into oop_disjoint_arraycopy(). There's no need to check for
> 'length' being negative and calling 'slow_arraycopy' because this case
> is already handled earlier (in B6).
>
> Is this OK now?
>
> Thank you and best regards,
> Volker
>
>
> On Fri, Aug 26, 2016 at 3:51 AM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>> Looks good.
>>
>> Check does not fold because it is different: LT vs LE.
>>
>> Actually there are 3 checks together with yours (see
>> PhaseMacroExpand::generate_arraycopy()):
>>
>>    Node* not_pos = generate_nonpositive_guard(ctrl, copy_length,
>> length_never_negative);
>>    if (not_pos != NULL) {
>>      Node* local_ctrl = not_pos, *local_io = *io;
>>      MergeMemNode* local_mem = MergeMemNode::make(mem);
>>      transform_later(local_mem);
>>
>>      // (6) length must not be negative.
>>      if (!length_never_negative) {
>>        generate_negative_guard(&local_ctrl, copy_length, slow_region);
>>      }
>>
>> I think the only way to avoid this is to modify code in generate_arraycopy()
>> when EliminateAllocations is true. In such case you need to generate only
>> length == 0 check.
>>
>> Thanks,
>> Vladimir
>>
>>
>> On 8/25/16 10:03 AM, Volker Simonis wrote:
>>>
>>> On Tue, Aug 16, 2016 at 11:49 PM, Vladimir Kozlov
>>> <vladimir.kozlov at oracle.com> wrote:
>>>>
>>>> Not generating the exception is definitely a bug.
>>>>
>>>> First, about test case. It would be nice if it also verifies other
>>>> IndexOutOfBoundsException cases.
>>>>
>>>
>>> I've extended the test case. See:
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v2/
>>>
>>> With the new test I've caught another problem in C1 (only on x86 and
>>> s390, but that's not in the OpenJDK yet :).
>>>
>>> LIR_Assembler::emit_arraycopy() had a shortcut for length==0 which
>>> prevented the throwing of an ArrayStoreException if src and dst arrays
>>> have incompatible type (see do_test2() in the new regression test).
>>> Note that this is a different error from 8160591 and not fixed by the
>>> change for 8160591.
>>>
>>> I've also moved the new check after the offset + length check as
>>> suggested by you (see new webrev).
>>>
>>> Unfortunately, the new check is still not eliminated. Here's how it looks:
>>>
>>> 0ae   B6: #    B20 B7 <- B5  Freq: 0,999997
>>> 0ae       movl    R9, [rsp + #0]    # spill
>>> 0b2       testl   R9, R9
>>> 0b5       jl     B20  P=0,000001 C=-1,000000
>>> 0b5
>>> 0bb   B7: #    B12 B8 <- B6  Freq: 0,999996
>>> 0bb       movl    R11, [R10 + #8 (8-bit)]    # compressed klass ptr
>>> 0bf       decode_klass_not_null RAX,R11
>>> 0cc       movl    RBX, [RAX + #16 (8-bit)]    # int
>>> 0cf       movslq  RCX, RBX    # i2l
>>> 0d2       movq    RSI, precise klass [Ljava/lang/Object;:
>>> 0x00007ff1080320d0:Constant:exact *    # ptr
>>> 0dc       movq    RCX, [RSI + RCX]    # class
>>> 0e0       cmpq    RAX, RCX    # ptr
>>> 0e3       jne,us  B12  P=0,170000 C=-1,000000
>>> 0e3
>>> 0e5   B8: #    B21 B9 <- B7 B13 B14  Freq: 0,999996
>>> 0e5       testl   R9, R9
>>> 0e8       jle     B21  P=0,000001 C=-1,000000
>>>
>>> As you can see 'testl   R9, R9' is executed two times.
>>>
>>> I've even tried to move the new check after the subtype check, but
>>> that doesn't help either:
>>>
>>> 0da   B7: #    B20 B8 <- B6 B13 B14  Freq: 0,999997
>>> 0da       movl    R11, [rsp + #8]    # spill
>>> 0df       testl   R11, R11
>>> 0e2       jl     B20  P=0,000001 C=-1,000000
>>> 0e2
>>> 0e8   B8: #    B10 B9 <- B7  Freq: 0,999996
>>> 0e8       testl   R11, R11
>>> 0eb       jle,s   B10  P=0,000001 C=-1,000000
>>>
>>> Any idea how this could be fixed?
>>>
>>> Thanks,
>>> Volker
>>>
>>> PS: and I still don't have a reproducible benchmark which shows a
>>> regression with my change...
>>>
>>>
>>>> Actually, an additional dynamic check will help in the case where a negative
>>>> length is known during compilation. The allocation code will be eliminated
>>>> very early instead of waiting for macro expansion:
>>>>
>>>>        int length = alloc->in(AllocateNode::ALength)->find_int_con(-1);
>>>>        if (length < 0) {
>>>>          NOT_PRODUCT(fail_eliminate = "Array's size is not constant";)
>>>>          can_eliminate = false;
>>>>        }
>>>>
>>>> About the additional length check in your new test: I think it may be
>>>> collapsed with the preceding check since it is generated after the other
>>>> checks. So I would suggest moving it after the offset + length check.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>
>>>> On 8/16/16 7:57 AM, Volker Simonis wrote:
>>>>>
>>>>>
>>>>> On Tue, Aug 16, 2016 at 7:24 AM, Tobias Hartmann
>>>>> <tobias.hartmann at oracle.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Volker,
>>>>>>
>>>>>> thanks for taking care of this issue!
>>>>>>
>>>>>> Did you check what happens if the allocation is not eliminated and
>>>>>> macro
>>>>>> expansion phase emits another negative guard? Are the checks merged?
>>>>>>
>>>>>
>>>>> It depends. I just saw that in some cases the regression test worked
>>>>> before, because the length check was done in
>>>>> SharedRuntime::slow_arraycopy_C(). So in that case there's obviously
>>>>> nothing that can be merged. But the test case is obviously a
>>>>> degenerated example anyway, so I don't think that's a problem.
>>>>>
>>>>> If I do a more real-world example like this where the arraycopy cannot
>>>>> be eliminated because one of its arguments escapes:
>>>>>
>>>>>    public static boolean do_test2(int length, Object[] dest) {
>>>>>      try {
>>>>>        System.arraycopy(new Object[10], 1, dest, 1, length);
>>>>>        return false;
>>>>>      } catch (IndexOutOfBoundsException e) {
>>>>>        return true;
>>>>>      }
>>>>>    }
>>>>>
>>>>> and call it with:
>>>>>
>>>>> do_test2(8, new Object[10])
>>>>>
>>>>> the generated code for do_test2() unfortunately contains one more
>>>>> check now with my change (the 'length' field is in [rsp + #0]):
>>>>>
>>>>> 0a2   B4: #    B18 B5 <- B3  Freq: 0,999999
>>>>> 0a2       movl    R9, [rsp + #0]    # spill
>>>>> 0a6       testl   R9, R9
>>>>> 0a9       jl     B18  P=0,000001 C=-1,000000
>>>>> 0a9
>>>>> 0af   B5: #    B18 B6 <- B4  Freq: 0,999998
>>>>> 0af       movl    RBX, R9    # spill
>>>>> 0b2       incl    RBX    # int
>>>>> 0b4       cmpl    RBX, #10    # unsigned
>>>>> 0b7       jnbe,u  B18  P=0,000001 C=-1,000000
>>>>>
>>>>> The generated code before my change looked like this (again, the
>>>>> 'length' field is in [rsp + #0]):
>>>>>
>>>>> 0a1   B4: #    B17 B5 <- B3  Freq: 0,999999
>>>>> 0a1       movl    R11, [rsp + #8]    # spill
>>>>> 0a6       incl    R11    # int
>>>>> 0a9       cmpl    R11, #10    # unsigned
>>>>> 0ad       jnbe,u  B17  P=0,000001 C=-1,000000
>>>>>
>>>>> It seems that the 'length' check has been completely eliminated before.
>>>>>
>>>>> So I need to do some more tests to understand why the new check isn't
>>>>> eliminated.
>>>>>
>>>>> Do you think the new check results in a performance regression? Have
>>>>> you run some benchmarks?
>>>>>
>>>>>> I would prefer brackets around the if body but you don't need to send
>>>>>> another webrev:
>>>>>>   if (EliminateAllocations) {
>>>>>>     generate_negative_guard(length, slow_region);
>>>>>>   }
>>>>>
>>>>>
>>>>>
>>>>> Yes, I agree.
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Tobias
>>>>>>
>>>>>> On 12.08.2016 21:13, Volker Simonis wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> can I please have a review and sponsor for the following fix:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8159611
>>>>>>>
>>>>>>>
>>>>>>> We are inserting several checks for the arguments of
>>>>>>> System.arraycopy() in LibraryCallKit::inline_arraycopy() before
>>>>>>> intrinsifying the call in LibraryCallKit::inline_arraycopy. However the
>>>>>>> check for the 'length' argument of arraycopy is postponed to the macro
>>>>>>> expansion phase in PhaseMacroExpand::generate_arraycopy().
>>>>>>>
>>>>>>> But if we are running with EscapeAnalysis and EliminateAllocations,
>>>>>>> the array allocations inside a call to System.arraycopy() may get
>>>>>>> eliminated and thus the complete call to System.arraycopy() will be
>>>>>>> removed (see PhaseMacroExpand::process_users_of_allocation). In this
>>>>>>> case the extra 'length' check won't be added by
>>>>>>> PhaseMacroExpand::generate_arraycopy() any more because macro
>>>>>>> expansion happens after the elimination of macro nodes.
>>>>>>>
>>>>>>> In such a case it may happen that System.arraycopy() will silently
>>>>>>> accept an invalid (i.e. negative) 'length' parameter, although it
>>>>>>> should actually throw an ArrayIndexOutOfBoundsException.
>>>>>>>
>>>>>>> The fix is simple: also insert a check for the length field in
>>>>>>> LibraryCallKit::inline_arraycopy() if we are running with
>>>>>>> EliminateAllocations.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Volker
>>>>>>>
>>>>
>>
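
For readers who want to try the two exception cases discussed in this thread, the following self-contained Java sketch checks them against the documented behavior of System.arraycopy(). The class and method names are hypothetical, and the int[]/Object[] pair is only an assumed example of "incompatible types"; the actual regression test in the webrev may look different. The loop in main() is there only to give the JIT a chance to compile the methods.

```java
// Hypothetical reproducer sketch; not the regression test from the webrev.
final class ArraycopyChecksSketch {
    // True if a copy with the given length is rejected with an
    // IndexOutOfBoundsException. The source array does not escape.
    static boolean rejectsLength(int length) {
        try {
            System.arraycopy(new Object[10], 1, new Object[10], 1, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // True if copying zero elements between a primitive array and a
    // reference array is rejected with an ArrayStoreException.
    static boolean rejectsMismatchedTypes() {
        try {
            System.arraycopy(new int[1], 0, new Object[1], 0, 0);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        boolean ok = true;
        // Repeat often enough that the methods are likely to be JIT-compiled.
        for (int i = 0; i < 50_000; i++) {
            ok &= rejectsLength(-1) && !rejectsLength(8) && rejectsMismatchedTypes();
        }
        System.out.println(ok ? "all checks held" : "a check was skipped");
    }
}
```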

From vitalyd at gmail.com  Tue Sep 13 17:51:46 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 13 Sep 2016 13:51:46 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
Message-ID: <CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>

On Tuesday, September 13, 2016, Cheremin Ruslan <cheremin at gmail.com> wrote:

> > I'm seeing some code that iterates over a ConcurrentHashMap's entrySet
> > that allocates tens of GB of CHM$MapEntry objects even though they don't
> > escape
>
>
> I'm a bit confused: I was sure BCEA-style params do affect EA, but don't
> affect scalar replacement. With bcEscapeAnalyser you can get (sort of)
> inter-procedural EA, but this only allows you to have more allocations
> identified as ArgEscape instead of GlobalEscape. But you can't get more
> NoEscape without real inlining. ArgEscape (afaik) is used only for
> synchronization removals in HotSpot, not for scalar replacements.
>
> Am I incorrect?

That's my understanding as well (and matches what I'm seeing in some
synthetic test harnesses).

I'm generally seeing a lot of variability in scalar replacement in
particular, all driven by profile data.  HashMap<Integer, ...>::get(int)
sometimes works at eliminating the box and sometimes doesn't - the
difference appears to be whether Integer::equals is inlined or not, which
in turn depends on whether the lookup finds something or not and whether
the number of successful lookups reaches compilation threshold. It's pretty
brittle, sadly, and more importantly, unstable.
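
As an illustration of the code shape in question (a sketch only; the class name and map contents are made up), the lookup below autoboxes its int key at the call to get(). Whether the resulting Integer is scalar-replaced is a JIT decision that this example cannot show or guarantee; it only shows where the temporary comes from.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example of a boxed-key lookup; names are invented.
final class BoxedLookupSketch {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        for (int i = 0; i < 1000; i++) {
            NAMES.put(i, "name-" + i);
        }
    }

    // The int key is boxed via Integer.valueOf(key) before get() is called.
    // Small values come from the Integer cache; larger ones allocate a box
    // unless the JIT manages to eliminate it.
    static String lookup(int key) {
        return NAMES.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookup(500));
        System.out.println(lookup(5000));
    }
}
```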


> ----
> Ruslan


-- 
Sent from my phone

From cheremin at gmail.com  Tue Sep 13 18:25:09 2016
From: cheremin at gmail.com (Ruslan Cheremin)
Date: Tue, 13 Sep 2016 21:25:09 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
Message-ID: <CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>

> That's my understanding as well (and matches what I'm seeing in some
> synthetic test harnesses).

OK, I just wanted to clear that up, because this is not the first time I have
seen BCEA... mentioned in the context of scalar replacement, and I was
starting to doubt my eyes :)

> It's pretty brittle, sadly, and more importantly, unstable.

In similar experiments I see the same. E.g. a HashMap.get(TupleKey) lookup
can be successfully scalarized in 99% of cases, but scalarization breaks
once the key generation scheme changes slightly -- because the hash code
distribution becomes worse, HashMap buckets start converting themselves to
TreeBins, and the TreeBins code is a much harder task for EA.
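
A minimal sketch of the kind of lookup described here, assuming a simple two-field key (the TupleKey fields, the hash function and the map contents are all hypothetical). The temporary key allocated in lookup() is the candidate for scalar replacement; a hashCode() that collides heavily would push HashMap into its tree-bin code path, which is the situation described as harder for EA.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical composite-key lookup; not code from the thread.
final class TupleKeySketch {
    static final class TupleKey {
        final int a;
        final int b;

        TupleKey(int a, int b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TupleKey)) {
                return false;
            }
            TupleKey other = (TupleKey) o;
            return a == other.a && b == other.b;
        }

        @Override
        public int hashCode() {
            return 31 * a + b;
        }
    }

    private static final Map<TupleKey, String> MAP = new HashMap<>();

    static {
        for (int a = 0; a < 10; a++) {
            for (int b = 0; b < 10; b++) {
                MAP.put(new TupleKey(a, b), a + ":" + b);
            }
        }
    }

    // The key object lives only for the duration of the call to get().
    static String lookup(int a, int b) {
        return MAP.get(new TupleKey(a, b));
    }

    public static void main(String[] args) {
        System.out.println(lookup(3, 4));
    }
}
```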

Another can of worms is the mismatch between different inlining heuristics.
E.g. the FreqInlineSize and InlineSmallCode thresholds may give different
decisions for the same piece of code, and the inlining decision taken depends
on whether the method was already compiled or not -- which depends on the
finest details of initialization order and execution profile. These scenarios
have become rare in 1.8 with InlineSmallCode increased, but I'm not sure they
are gone...
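
To see which values of these two thresholds a given JVM is actually running with, one option is to read them through the HotSpotDiagnosticMXBean. This is a sketch under the assumption that the program runs on a HotSpot-based JDK that exposes com.sun.management; the class name and the "unavailable" fallback string are invented for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for inspecting inlining thresholds on HotSpot.
final class InliningFlagsSketch {
    // Returns the current value of a VM flag, or "unavailable" if the
    // flag or the diagnostic bean does not exist on this JVM.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"FreqInlineSize", "InlineSmallCode"}) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```

The printed values differ between JDK versions and platforms, which is part of why the same code can be inlined differently from one setup to another.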

Currently, I'm starting to think code needs to be written specifically with
EA/SR in mind to be more-or-less stably scalarized. I.e. you can't get it
for free (or it will be unstable).

----
Ruslan



From vitalyd at gmail.com  Tue Sep 13 18:33:51 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 13 Sep 2016 14:33:51 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
Message-ID: <CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>

On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin <cheremin at gmail.com> wrote:

> >That's my understanding as well (and matches what I'm seeing in some
> synthetic test harnesses).
>
> OK, I just wanted to clear that up, because this is not the first time I have
> seen BCEA... mentioned in the context of scalar replacement, and I was
> starting to doubt my eyes :)
>
> > It's pretty brittle, sadly, and more importantly, unstable.
>
> In similar experiments I see the same. E.g. a HashMap.get(TupleKey) lookup
> can be successfully scalarized in 99% of cases, but scalarization breaks
> once the key generation scheme changes slightly -- because the hash code
> distribution becomes worse, HashMap buckets start converting themselves to
> TreeBins, and the TreeBins code is a much harder task for EA.
>
> Another can of worms is the mismatch between different inlining heuristics.
> E.g. the FreqInlineSize and InlineSmallCode thresholds may give different
> decisions for the same piece of code, and the inlining decision taken depends
> on whether the method was already compiled or not -- which depends on the
> finest details of initialization order and execution profile. These scenarios
> have become rare in 1.8 with InlineSmallCode increased, but I'm not sure they
> are gone...
>
> Currently, I'm starting to think code needs to be written specifically with
> EA/SR in mind to be more-or-less stably scalarized. I.e. you can't get it
> for free (or it will be unstable).
>
I'm not sure this is practical, to be honest, at least for a big enough
application.  I've long considered EA (and scalar replacement) as a bonus
optimization, and never to rely on it if the allocations would hurt
otherwise.  I'm just a bit surprised *just* how unstable it appears to be,
in the "simplest" of cases.

I think code can be written to increase likelihood of scalar replacement,
but I just can't see how it can be made stable to the point where you can
rely/depend on it for performance.


From cheremin at gmail.com  Tue Sep 13 19:32:18 2016
From: cheremin at gmail.com (Ruslan Cheremin)
Date: Tue, 13 Sep 2016 22:32:18 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
Message-ID: <CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>

> how it can be made stable to the point where you can rely/depend on it for
> performance.

Well, the same can be said about any JIT optimization -- (maybe it is time to
rename dynamic runtime to stochastic runtime?). Personally I see SR as having
the same order of stability as inlining. Actually, apart from a few
SR-specific issues (like merge points), EA/SR mostly follows inlining:
if you have enough scope inlined you'll have, say, an 80% chance of SR. From
my perspective it is inlining which is so surprisingly unstable.
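
A sketch of what a "merge point" looks like in source code, assuming a trivial Point class (all names here are hypothetical). In single() the object comes from one allocation site; in merged() two allocation sites flow into the same variable. As far as I know, C2 at the time of this thread did not scalar-replace allocations that meet at such a merge, but the example itself only demonstrates the code shape, not the JIT outcome.

```java
// Hypothetical illustration of an allocation merge point.
final class MergePointSketch {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // One allocation site: the usual candidate for scalar replacement.
    static int single(int v) {
        Point p = new Point(v, v + 1);
        return p.x + p.y;
    }

    // Two allocation sites merge into 'p' after the conditional.
    static int merged(boolean flag, int v) {
        Point p = flag ? new Point(v, 1) : new Point(1, v);
        return p.x + p.y;
    }

    public static void main(String[] args) {
        System.out.println(single(3));
        System.out.println(merged(true, 3));
    }
}
```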

BTW: have you considered sharing your experience with EA/SR pitfalls? Even
if "increase likelihood" is the best option available -- there is still
very little information about it on the net.

----
Ruslan



From vitalyd at gmail.com  Tue Sep 13 19:44:05 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 13 Sep 2016 15:44:05 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
Message-ID: <CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>

On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin <cheremin at gmail.com> wrote:

> >how it can be made stable to the point where you can rely/depend on it
> for performance.
>
> Well, the same can be said about any JIT optimization -- (maybe it is time to
> rename dynamic runtime to stochastic runtime?). Personally I see SR as having
> the same order of stability as inlining. Actually, apart from a few
> SR-specific issues (like merge points), EA/SR mostly follows inlining:
> if you have enough scope inlined you'll have, say, an 80% chance of SR. From
> my perspective it is inlining which is so surprisingly unstable.
>
Yeah, I'd agree.  The difference, in my mind, is that failing to inline a
function may not have as drastic performance implications as failing to
eliminate temporaries.

>
> BTW: have you considered sharing your experience with EA/SR pitfalls? Even
> if "increase likelihood" is the best option available -- there is still
> very little information about it on the net.
>
I'm kind of doing that via the few emails on this list :).  I think you
pretty much covered the biggest (apparent) flake in the equation -
inlining, which can fail for all sorts of different reasons.  Beyond that,
there's the control flow insensitive aspect of the EA, which is
tangentially related to inlining (or lack thereof).

There was also another thread a few months back where I was asking why a
small local array allocation wasn't scalarized, and the answer there was
ordering between loop unrolling and EA passes (I can dig up that thread if
you're interested).  The bizarre thing there was the loop operation was
folded into a constant, and the compiled method was returning a constant
value, but the array allocation was left behind (although it wasn't needed).
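
The code from that earlier thread is not reproduced here, so the following is only a guess at the general shape being described: a small local array that is filled and summed in loops whose result is a compile-time constant. The class and method names are hypothetical, and whether the allocation survives in the compiled code depends on the JIT and its pass ordering.

```java
// Hypothetical shape of a "small local array" method; not the original code.
final class SmallArraySketch {
    // The array never leaves this method, and the loops compute a constant.
    static int sum() {
        int[] values = new int[4];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum());
    }
}
```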

I agree that there isn't much information about EA in Hotspot (there's a
lot of handwaving and inaccuracies online).  In particular, it'd be nice if
the performance wiki had a section on making user code play well with EA
(just like it has guidance on some other JIT aspects currently).

>
> ----
> Ruslan
>
>
>
> 2016-09-13 21:33 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>
>>
>>
>> On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin <cheremin at gmail.com>
>> wrote:
>>
>>> >That's my understanding as well (and matches what I'm seeing in some
>>> synthetic test harnesses).
>>>
>>> OK, I just wanted to clear that up, because this is not the first time I
>>> have seen BCEA... mentioned in the context of scalar replacement, and I was
>>> starting to doubt my eyes :)
>>>
>>> > It's pretty brittle, sadly, and more importantly, unstable.
>>>
>>> In similar experiments I see the same. E.g. a HashMap.get(TupleKey) lookup
>>> can be successfully scalarized in 99% of cases, but scalarization breaks
>>> once the key generation scheme changes slightly -- because the hash code
>>> distribution becomes worse, HashMap buckets start converting themselves to
>>> TreeBins, and the TreeBins code is a much harder task for EA.
>>>
>>> Another can of worms is the mismatch between different inlining heuristics.
>>> E.g. the FreqInlineSize and InlineSmallCode thresholds may give different
>>> decisions for the same piece of code, and the inlining decision taken
>>> depends on whether the method was already compiled or not -- which depends
>>> on the finest details of initialization order and execution profile. These
>>> scenarios have become rare in 1.8 with InlineSmallCode increased, but I'm
>>> not sure they are gone...
>>>
>>> Currently, I'm starting to think code needs to be written specifically
>>> with EA/SR in mind to be more-or-less stably scalarized. I.e. you can't get
>>> it for free (or it will be unstable).
>>>
>> I'm not sure this is practical, to be honest, at least for a big enough
>> application.  I've long considered EA (and scalar replacement) as a bonus
>> optimization, and never to rely on it if the allocations would hurt
>> otherwise.  I'm just a bit surprised *just* how unstable it appears to be,
>> in the "simplest" of cases.
>>
>> I think code can be written to increase likelihood of scalar replacement,
>> but I just can't see how it can be made stable to the point where you can
>> rely/depend on it for performance.
>>
>>>
>>> ----
>>> Ruslan
>>>
>>>
>>> 2016-09-13 20:51 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>>>
>>>>
>>>>
>>>> On Tuesday, September 13, 2016, Cheremin Ruslan <cheremin at gmail.com>
>>>> wrote:
>>>>
>>>>> > I'm seeing some code that iterates over a ConcurrentHashMap's
>>>>> entrySet that allocates tens of GB of CHM$MapEntry objects even though they
>>>>> don't escape
>>>>>
>>>>>
>>>>> I'm a bit confused: I was sure BCEA-style params do affect EA, but
>>>>> don't affect scalar replacement. With bcEscapeAnalyser you can get (sort
>>>>> of) inter-procedural EA, but this only allows you to have more allocations
>>>>> identified as ArgEscape instead of GlobalEscape. But you can't get more
>>>>> NoEscape without real inlining. ArgEscape (afaik) is used only for
>>>>> synchronization removals in HotSpot, not for scalar replacements.
>>>>>
>>>>> Am I incorrect?
>>>>
>>>> That's my understanding as well (and matches what I'm seeing in some
>>>> synthetic test harnesses).
>>>>
>>>> I'm generally seeing a lot of variability in scalar replacement in
>>>> particular, all driven by profile data.  HashMap<Integer, ...>::get(int)
>>>> sometimes works at eliminating the box and sometimes doesn't - the
>>>> difference appears to be whether Integer::equals is inlined or not, which
>>>> in turn depends on whether the lookup finds something or not and whether
>>>> the number of successful lookups reaches compilation threshold. It's pretty
>>>> brittle, sadly, and more importantly, unstable.
>>>>
>>>>
>>>>
>>>>> ----
>>>>> Ruslan
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from my phone
>>>>
>>>
>>>
>>
>

From vladimir.kozlov at oracle.com  Tue Sep 13 19:52:24 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Sep 2016 12:52:24 -0700
Subject: please sponsor? RFR(M): 8165235: [TESTBUG] RTM tests must check
	OS version
In-Reply-To: <d509c4eeb57949c88f4b3df0b6f35409@DEWDFE13DE50.global.corp.sap>
References: <d509c4eeb57949c88f4b3df0b6f35409@DEWDFE13DE50.global.corp.sap>
Message-ID: <57D858F8.1010807@oracle.com>

Submitted to JPRT.

Thanks,
Vladimir

On 9/8/16 7:38 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> This change was reviewed by Volker Simonis and Fillipp Zhinkin.
> Final webrevs:
> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.bs/
> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.hs/
>
> Could someone please sponsor?
>
> Thanks!
>    Goetz
>
>> -----Original Message-----
>> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz
>> Sent: Monday, 5 September 2016 13:55
>> To: hotspot-compiler-dev at openjdk.java.net
>> Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
>>
>> Hi,
>>
>>
>>
>> This fixes the RTM tests with respect to the supported platforms on ppc.
>>
>> Please review this change. I also need a sponsor.
>> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.bs/
>>
>> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.hs/
>>
>>
>> RTM uses special instructions that are only available on recent x86 CPUs. On
>> x86, this feature does not need OS support. On ppc, the equivalent
>> functionality, hardware transactional memory, requires OS support. Thus the
>> feature is only enabled by the VM if CPU and OS are at a specific level. The
>> tests must check this, too. This holds for AIX and Linux.
>>
>>
>>
>> To do so, this change introduces rtm/predicate/SupportedOS.java which
>> checks for proper OS versions on ppc, else returns true.
>>
>> The OS version is retrieved from Platform.java, which has new methods
>> getOsVersionMajor() and  getOsVersionMinor().
>>
>> To simplify the checks in the tests, I also introduced a 3-way AndPredicate
>> constructor.
>>
>>
>>
>> To simplify the OS version check on AIX, I changed enabling RTM on AIX to
>> require AIX 7.2.
>>
>> Before, it was enabled on AIX 7.1.3.30, which contains an important bug fix.
>> The last digits of this version are not exported to the os.version property,
>> so I cannot check for them in the test.
>>
>>
>>
>> Best regards,
>>
>>    Goetz.
>
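
The OS version check described in the quoted RFR could look roughly like
the sketch below. This is not the code from the webrev: the class and
method names are hypothetical, only the AIX 7.2 requirement is taken from
the mail, and the Linux level is left out because the mail does not state
it. The parsing assumes an os.version string of the form "major.minor...".

```java
final class OsVersionSketch {
    // Returns the numeric value of the index-th dot-separated component of an
    // os.version style string, or 0 if the component is missing or not numeric.
    static int component(String version, int index) {
        String[] parts = version.split("\\.");
        if (index >= parts.length) {
            return 0;
        }
        // Keep only the leading digits, e.g. "10-generic" becomes "10".
        String digits = parts[index].replaceFirst("^(\\d*).*$", "$1");
        return digits.isEmpty() ? 0 : Integer.parseInt(digits);
    }

    // True if version is at least major.minor.
    static boolean atLeast(String version, int major, int minor) {
        int actualMajor = component(version, 0);
        int actualMinor = component(version, 1);
        return actualMajor > major || (actualMajor == major && actualMinor >= minor);
    }

    // Hypothetical predicate: restrict AIX to 7.2 or later, accept everything else.
    static boolean supportedOs(String osName, String osVersion) {
        if ("AIX".equals(osName)) {
            return atLeast(osVersion, 7, 2);
        }
        return true;
    }

    public static void main(String[] args) {
        String name = System.getProperty("os.name");
        String version = System.getProperty("os.version");
        System.out.println(name + " " + version + " supported: " + supportedOs(name, version));
    }
}
```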

From vitalyd at gmail.com  Tue Sep 13 19:54:27 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 13 Sep 2016 15:54:27 -0400
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
Message-ID: <CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>

On Tuesday, September 13, 2016, Remi Forax <forax at univ-mlv.fr> wrote:

> I've always found the empty inner classes generated by javac to be a
> kind of hack.
>
> These classes should be removed in Java 10, thanks to the nestmate
> attributes.
>   http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html
>
> The other solution, is to have an empty class in the jdk which is not
> visible from javac (the class itself can be marked as synthetic),
> so javac can use it without creating method clash.
>
> and to solve the problem now, the easy solution is to add a package
> private constructor in ArrayList.Itr,
>
I'm hoping Oracle can take Kris' (Azul) patch (or do something similar).
It might catch more cases than just modifying Itr.

>
>    private class Itr implements Iterator<E> {
>        int cursor;       // index of next element to return
>        int lastRet = -1; // index of last element returned; -1 if no such
>        int expectedModCount = modCount;
>
>        Itr() {
>            // avoids generating a synthetic accessor constructor
>        }
>    }
>
>
> regards,
> Rémi
>
> ------------------------------
>
> *From: *"Vitaly Davidovich" <vitalyd at gmail.com>
> *To: *"Krystal Mok" <rednaxelafx at gmail.com>
> *Cc: *"hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
> *Sent: *Monday, 12 September 2016 22:15:41
> *Subject: *Re: Odd interaction between ArrayList$Itr and Escape Analysis
>
>
>
> On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>
>> On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>>
>>> It seems odd to me as well why inlining won't force load the missing
>>> class(es).  If we're inlining, it means the method itself or the call chain
>>> it's part of is hot - failing to inline can have negative side-effects,
>>> like this example.  I suppose there must be a good reason why it doesn't do
>>> this though?
>>>
>>
>> That's because we can't. The JIT compilers are running on their own
>> threads, and they're not real "Java threads". So they are not allowed to
>> run arbitrary Java code. But Java class loading may involve running
>> arbitrary Java code, e.g. the ClassLoader.loadClass() upcall.
>> Force class loading can be done on the triggering side (for the top-level
>> method), because compilation tasks are triggered from real Java threads,
>> and they're allowed to run arbitrary Java code.
>>
> I see, makes sense.  Perhaps there can be an option to turn on loading of
> required types in the entire compilation unit, after all inlining is done
> (and therefore make the unloaded types not be barriers for inlining).  I'd
> personally prefer that over having odd performance differences.
>
>>
>> - Kris
>>
> Thanks Kris.
>
>

-- 
Sent from my phone

From cheremin at gmail.com  Tue Sep 13 19:55:05 2016
From: cheremin at gmail.com (Ruslan Cheremin)
Date: Tue, 13 Sep 2016 22:55:05 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
Message-ID: <CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>

> There was also another thread a few months back where I was asking why a
> small local array allocation wasn't scalarized, and the answer there was
> ordering between loop unrolling and EA passes (I can dig up that thread if
> you're interested).

That would be very nice, please. I tried to google it myself (since you
had already mentioned it in the thread) but wasn't able to guess the right
keywords :)


2016-09-13 22:44 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:

>
>
> On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin <cheremin at gmail.com>
> wrote:
>
>> >how it can be made stable to the point where you can rely/depend on it
>> for performance.
>>
>> Well, same can be said about any JIT optimization -- (may be it is time
>> to rename dynamic runtime to stochastic runtime?). Personally I see SR to
>> be the same order of stability as inlining. Actually, apart from few
>> SR-specific issues (like with merge points), EA/SR mostly follow inlining:
>> if you have enough scope inlined you'll have, say, 80% chance of SR.
>> From my perspective it is inlining which is so surprisingly unstable.
>>
> Yeah, I'd agree.  The difference, in my mind, is failing to inline a
> function may not have as drastic performance implications as failing to
> eliminate temporaries.
>
>>
>> BTW: have you considered sharing your experience with EA/SR pitfalls?
>> Even if "increase likelihood" is the best option available, there is
>> still very little information about it on the net.
>>
> I'm kind of doing that via the few emails on this list :).  I think you
> pretty much covered the biggest (apparent) flake in the equation -
> inlining, which can fail for all sorts of different reasons.  Beyond that,
> there's the control flow insensitive aspect of the EA, which is
> tangentially related to inlining (or lack thereof).
>
> There was also another thread a few months back where I was asking why a
> small local array allocation wasn't scalarized, and the answer there was
> ordering between loop unrolling and EA passes (I can dig up that thread if
> you're interested).  The bizarre thing there was the loop operation was
> folded into a constant, and the compiled method was returning a constant
> value, but the array allocation was left behind (although it wasn't needed).
>
> I agree that there isn't much information about EA in Hotspot (there's a
> lot of handwaving and inaccuracies online).  In particular, it'd be nice if
> the performance wiki had a section on making user code play well with EA
> (just like it has guidance on some other JIT aspects currently).
>
>>
>> ----
>> Ruslan
>>
>>
>>
>> 2016-09-13 21:33 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>>
>>>
>>>
>>> On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin <cheremin at gmail.com>
>>> wrote:
>>>
>>>> >That's my understanding as well (and matches what I'm seeing in some
>>>> synthetic test harnesses).
>>>>
>>>> OK, I just wanted to clear that up, because it is not the first time I
>>>> have seen BCEA... mentioned in the context of scalar replacement, and I
>>>> was starting to doubt my eyes :)
>>>>
>>>> >It's pretty brittle, sadly, and more importantly, unstable.
>>>>
>>>> Running similar experiments, I see the same. E.g. a HashMap.get(TupleKey)
>>>> lookup is successfully scalarized in 99% of cases, but scalarization broke
>>>> once the key generation scheme changed slightly: the hash code
>>>> distribution became worse, HashMap buckets started converting themselves
>>>> to TreeBins, and the TreeBin code is a much harder task for EA.
>>>>
>>>> Another can of worms is the mismatch between different inlining
>>>> heuristics. E.g. the FreqInlineSize and InlineSmallCode thresholds may give
>>>> different decisions for the same piece of code, and the inlining decision
>>>> taken depends on whether the method was already compiled or not, which in
>>>> turn depends on the finest details of initialization order and execution
>>>> profile. These scenarios became rarer in 1.8 with InlineSmallCode
>>>> increased, but I'm not sure they are gone...
>>>>
>>>> Currently, I'm starting to think code needs to be written specifically
>>>> with EA/SR in mind to be more-or-less stably scalarized. I.e. you can't get
>>>> it for free (or it will be unstable).
>>>>
>>> I'm not sure this is practical, to be honest, at least for a big enough
>>> application.  I've long considered EA (and scalar replacement) as a bonus
>>> optimization, and never to rely on it if the allocations would hurt
>>> otherwise.  I'm just a bit surprised *just* how unstable it appears to be,
>>> in the "simplest" of cases.
>>>
>>> I think code can be written to increase likelihood of scalar
>>> replacement, but I just can't see how it can be made stable to the point
>>> where you can rely/depend on it for performance.
>>>
>>>>
>>>> ----
>>>> Ruslan
>>>>
>>>>
>>>> 2016-09-13 20:51 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>>>>
>>>>>
>>>>>
>>>>> On Tuesday, September 13, 2016, Cheremin Ruslan <cheremin at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> > I'm seeing some code that iterates over a ConcurrentHashMap's
>>>>>> entrySet that allocates tens of GB of CHM$MapEntry objects even though they
>>>>>> don't escape
>>>>>>
>>>>>>
>>>>>> I'm a bit confused: I was sure BCEA-style params do affect EA, but
>>>>>> don't affect scalar replacement. With bcEscapeAnalyser you can get (sort
>>>>>> of) inter-procedural EA, but this only allows you to have more allocations
>>>>>> identified as ArgEscape instead of GlobalEscape. But you can't get more
>>>>>> NoEscape without real inlining. ArgEscape (afaik) is used only for
>>>>>> synchronization removals in HotSpot, not for scalar replacements.
>>>>>>
>>>>>> Am I incorrect?
>>>>>
>>>>> That's my understanding as well (and matches what I'm seeing in some
>>>>> synthetic test harnesses).
>>>>>
>>>>> I'm generally seeing a lot of variability in scalar replacement in
>>>>> particular, all driven by profile data.  HashMap<Integer, ...>::get(int)
>>>>> sometimes works at eliminating the box and sometimes doesn't - the
>>>>> difference appears to be whether Integer::equals is inlined or not, which
>>>>> in turn depends on whether the lookup finds something or not and whether
>>>>> the number of successful lookups reaches compilation threshold. It's pretty
>>>>> brittle, sadly, and more importantly, unstable.
>>>>>
>>>>>
>>>>>
>>>>>> ----
>>>>>> Ruslan
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Sent from my phone
>>>>>
>>>>
>>>>
>>>
>>
>

From rednaxelafx at gmail.com  Tue Sep 13 20:01:39 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Tue, 13 Sep 2016 13:01:39 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
Message-ID: <CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>

And I'm happy to upstream that patch, if the team is interested.

Now, when I first discovered the problem, my first intuition was that it's
better to "fix" it in javac. But until nestmates are supported in the class
file format, there isn't much that javac can do. Changing the Java libraries
to not use private constructors in inner classes is also doable, but needs
changes to a lot of files.

So I ended up fixing it in the VM, even though I agree fully with what Rémi
brought up.

The access constructor tag thingy in javac is really a weird hack. If you
guys ever look at the contents of ArrayList$1, it's really empty -- the
class doesn't even declare some of the usual structures in a normal Class
file... Hopefully we can get rid of it in javac soon.
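
For readers who have not looked at this before, the sketch below shows the
pattern in question with made-up class names (it is not the ArrayList
source). With a javac that predates nestmates, instantiating the inner
class whose constructor is implicitly private makes javac emit a synthetic
access constructor plus an empty Outer$1 tag class; a javac targeting a
class file version with nestmates should emit neither. Declaring a
package-private constructor, as Rémi suggested, avoids the synthetic
constructor either way.

```java
import java.lang.reflect.Constructor;

final class AccessConstructorSketch {
    // Private inner class: its default constructor is implicitly private.
    private class ImplicitPrivateCtor {
    }

    // Same shape, but with an explicit package-private constructor.
    private class PackagePrivateCtor {
        PackagePrivateCtor() {
        }
    }

    Object newImplicit() {
        // Pre-nestmates javac routes this call through a synthetic constructor.
        return new ImplicitPrivateCtor();
    }

    Object newPackagePrivate() {
        // Directly accessible from the outer class, no accessor needed.
        return new PackagePrivateCtor();
    }

    // Counts the compiler-generated constructors declared by a class.
    static int syntheticConstructors(Class<?> c) {
        int count = 0;
        for (Constructor<?> ctor : c.getDeclaredConstructors()) {
            if (ctor.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        AccessConstructorSketch outer = new AccessConstructorSketch();
        // The first number depends on the javac version and target release.
        System.out.println("implicit private ctor, synthetic constructors: "
                + syntheticConstructors(outer.newImplicit().getClass()));
        System.out.println("package-private ctor, synthetic constructors: "
                + syntheticConstructors(outer.newPackagePrivate().getClass()));
    }
}
```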

- Kris

On Tuesday, September 13, 2016, Vitaly Davidovich <vitalyd at gmail.com> wrote:

>
>
> On Tuesday, September 13, 2016, Remi Forax <forax at univ-mlv.fr> wrote:
>
>> I've always found the empty inner classes generated by javac to be a
>> kind of hack.
>>
>> These classes should be removed in Java 10, thanks to the nestmate
>> attributes.
>>   http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html
>>
>> The other solution, is to have an empty class in the jdk which is not
>> visible from javac (the class itself can be marked as synthetic),
>> so javac can use it without creating method clash.
>>
>> and to solve the problem now, the easy solution is to add a package
>> private constructor in ArrayList.Itr,
>>
> I'm hoping Oracle can take Kris' (Azul) patch (or do something similar).
> It might catch more cases than just modifying Itr.
>
>>
>>    private class Itr implements Iterator<E> {
>>        int cursor;       // index of next element to return
>>        int lastRet = -1; // index of last element returned; -1 if no such
>>        int expectedModCount = modCount;
>>
>>        Itr() {
>>            // avoids generating a synthetic accessor constructor
>>        }
>>    }
>>
>>
>> regards,
>> Rémi
>>
>> ------------------------------
>>
>> *From: *"Vitaly Davidovich" <vitalyd at gmail.com>
>> *To: *"Krystal Mok" <rednaxelafx at gmail.com>
>> *Cc: *"hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
>> *Sent: *Monday, 12 September 2016 22:15:41
>> *Subject: *Re: Odd interaction between ArrayList$Itr and Escape Analysis
>>
>>
>>
>> On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok <rednaxelafx at gmail.com>
>> wrote:
>>
>>> On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich <vitalyd at gmail.com>
>>> wrote:
>>>>
>>>> It seems odd to me as well why inlining won't force load the missing
>>>> class(es).  If we're inlining, it means the method itself or the call chain
>>>> it's part of is hot - failing to inline can have negative side-effects,
>>>> like this example.  I suppose there must be a good reason why it doesn't do
>>>> this though?
>>>>
>>>
>>> That's because we can't. The JIT compilers are running on their own
>>> threads, and they're not real "Java threads". So they are not allowed to
>>> run arbitrary Java code. But Java class loading may involve running
>>> arbitrary Java code, e.g. the ClassLoader.loadClass() upcall.
>>> Force class loading can be done on the triggering side (for the
>>> top-level method), because compilation tasks are triggered from real Java
>>> threads, and they're allowed to run arbitrary Java code.
>>>
>> I see, makes sense.  Perhaps there can be an option to turn on loading of
>> required types in the entire compilation unit, after all inlining is done
>> (and therefore make the unloaded types not be barriers for inlining).  I'd
>> personally prefer that over having odd performance differences.
>>
>>>
>>> - Kris
>>>
>> Thanks Kris.
>>
>>
>
> --
> Sent from my phone
>

From vladimir.kozlov at oracle.com  Tue Sep 13 20:15:06 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Sep 2016 13:15:06 -0700
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
Message-ID: <57D85E4A.2080302@oracle.com>

If an allocation is done locally in a loop, it can be scalar replaced (SR), but that is not guaranteed:

for () {
   Foo f = new Foo();
}

"Currently" we can't SR it if there is a merge:

Foo f = new Foo();
for () {
   f = new Foo();
}
x = f.x;

Also, we can't SR an array if it is accessed with a variable index, because we can't map the loads/stores to a concrete element:

int[] a = new int[3];
for (i) {
   x = a[i];
}

If elements are accessed with constant indices only (e.g. using an array to pass or return several values), or a loop is fully unrolled, we can SR it:

x0 = a[0];
x1 = a[1];
x2 = a[2];

Regards,
Vladimir

On 9/13/16 12:55 PM, Ruslan Cheremin wrote:
>>There was also another thread a few months back where I was asking why a small local array allocation wasn't scalarized, and the answer there was ordering between loop unrolling and EA passes (I can
> dig up that thread if you're interested).
>
> It would be very nice, please -- I've tried to google it by myself (because you've noted it already in the thread) but wasn't able to guess right keywords :)
>
>
> 2016-09-13 22:44 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com <mailto:vitalyd at gmail.com>>:
>
>
>
>     On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin <cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>
>         >how it can be made stable to the point where you can rely/depend on it for performance.
>
>         Well, same can be said about any JIT optimization -- (may be it is time to rename dynamic runtime to stochastic runtime?). Personally I see SR to be the same order of stability as inlining.
>         Actually, apart from few SR-specific issues (like with merge points), EA/SR mostly follow inlining: if you have enough scope inlined you'll have, say, 80% chance of SR. From my perspective it
>         is inlining which is so surprisingly unstable.
>
>     Yeah, I'd agree.  The difference, in my mind, is failing to inline a function may not have as drastic performance implications as failing to eliminate temporaries.
>
>
>         BTW: have you considered sharing your experience with EA/SR pitfalls? Even if "increase likelihood" is the best option available, there is still very little information about it on the net.
>
>     I'm kind of doing that via the few emails on this list :).  I think you pretty much covered the biggest (apparent) flake in the equation - inlining, which can fail for all sorts of different
>     reasons.  Beyond that, there's the control flow insensitive aspect of the EA, which is tangentially related to inlining (or lack thereof).
>
>     There was also another thread a few months back where I was asking why a small local array allocation wasn't scalarized, and the answer there was ordering between loop unrolling and EA passes (I
>     can dig up that thread if you're interested).  The bizarre thing there was the loop operation was folded into a constant, and the compiled method was returning a constant value, but the array
>     allocation was left behind (although it wasn't needed).
>
>     I agree that there isn't much information about EA in Hotspot (there's a lot of handwaving and inaccuracies online).  In particular, it'd be nice if the performance wiki had a section on making
>     user code play well with EA (just like it has guidance on some other JIT aspects currently).
>
>
>         ----
>         Ruslan
>
>
>
>         2016-09-13 21:33 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com <mailto:vitalyd at gmail.com>>:
>
>
>
>             On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin <cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>
>                 >That's my understanding as well (and matches what I'm seeing in some synthetic test harnesses).
>
>                 OK, I just wanted to clear that up, because it is not the first time I have seen BCEA... mentioned in the context of scalar replacement, and I was starting to doubt my eyes :)
>
>                 >It's pretty brittle, sadly, and more importantly, unstable.
>
>                 Running similar experiments, I see the same. E.g. a HashMap.get(TupleKey) lookup is successfully scalarized in 99% of cases, but scalarization broke once the key generation scheme
>                 changed slightly: the hash code distribution became worse, HashMap buckets started converting themselves to TreeBins, and the TreeBin code is a much harder task for EA.
>
>                 Another can of worms is the mismatch between different inlining heuristics. E.g. the FreqInlineSize and InlineSmallCode thresholds may give different decisions for the same piece of
>                 code, and the inlining decision taken depends on whether the method was already compiled or not, which in turn depends on the finest details of initialization order and execution
>                 profile. These scenarios became rarer in 1.8 with InlineSmallCode increased, but I'm not sure they are gone...
>
>                 Currently, I'm starting to think code needs to be written specifically with EA/SR in mind to be more-or-less stably scalarized. I.e. you can't get it for free (or it will be unstable).
>
>             I'm not sure this is practical, to be honest, at least for a big enough application.  I've long considered EA (and scalar replacement) as a bonus optimization, and never to rely on it if
>             the allocations would hurt otherwise.  I'm just a bit surprised *just* how unstable it appears to be, in the "simplest" of cases.
>
>             I think code can be written to increase likelihood of scalar replacement, but I just can't see how it can be made stable to the point where you can rely/depend on it for performance.
>
>
>                 ----
>                 Ruslan
>
>
>                 2016-09-13 20:51 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com <mailto:vitalyd at gmail.com>>:
>
>
>
>                     On Tuesday, September 13, 2016, Cheremin Ruslan <cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>
>                         > I'm seeing some code that iterates over a ConcurrentHashMap's entrySet that allocates tens of GB of CHM$MapEntry objects even though they don't escape
>
>
>                         I'm a bit confused: I was sure BCEA-style params do affect EA, but don't affect scalar replacement. With bcEscapeAnalyser you can get (sort of) inter-procedural EA, but this
>                         only allows you to have more allocations identified as ArgEscape instead of GlobalEscape. But you can't get more NoEscape without real inlining. ArgEscape (afaik) is used only
>                         for synchronization removals in HotSpot, not for scalar replacements.
>
>                         Am I incorrect?
>
>                     That's my understanding as well (and matches what I'm seeing in some synthetic test harnesses).
>
>                     I'm generally seeing a lot of variability in scalar replacement in particular, all driven by profile data.  HashMap<Integer, ...>::get(int) sometimes works at eliminating the box
>                     and sometimes doesn't - the difference appears to be whether Integer::equals is inlined or not, which in turn depends on whether the lookup finds something or not and whether the
>                     number of successful lookups reaches compilation threshold. It's pretty brittle, sadly, and more importantly, unstable.
>
>
>
>                         ----
>                         Ruslan
>
>
>
>                     --
>                     Sent from my phone
>
>
>
>
>
>

From vitalyd at gmail.com  Tue Sep 13 20:29:13 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 13 Sep 2016 16:29:13 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
Message-ID: <CAHjP37FQC54aK5eH0r2PsgbCuZurkhvTzqBvfSem0JxLfJ0+fw@mail.gmail.com>

On Tue, Sep 13, 2016 at 3:55 PM, Ruslan Cheremin <cheremin at gmail.com> wrote:

> >There was also another thread a few months back where I was asking why a
> small local array allocation wasn't scalarized, and the answer there was
> ordering between loop unrolling and EA passes (I can dig up that thread if
> you're interested).
>
> It would be very nice, please -- I've tried to google it by myself
> (because you've noted it already in the thread) but wasn't able to guess
> right keywords :)
>
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-December/020546.html


>
>
> 2016-09-13 22:44 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>
>>
>>
>> On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin <cheremin at gmail.com>
>> wrote:
>>
>>> >how it can be made stable to the point where you can rely/depend on it
>>> for performance.
>>>
>>> Well, the same can be said about any JIT optimization (maybe it is time
>>> to rename "dynamic runtime" to "stochastic runtime"?). Personally, I see SR
>>> as having the same order of stability as inlining. Actually, apart from a
>>> few SR-specific issues (like with merge points), EA/SR mostly follows
>>> inlining: if you have enough scope inlined you'll have, say, an 80% chance
>>> of SR. From my perspective it is inlining which is so surprisingly unstable.
>>>
>> Yeah, I'd agree.  The difference, in my mind, is failing to inline a
>> function may not have as drastic performance implications as failing to
>> eliminate temporaries.
>>
>>>
>>> BTW: have you considered sharing your experience with EA/SR pitfalls?
>>> Even if "increase likelihood" is the best option available, there is
>>> still very little information about it on the net.
>>>
>> I'm kind of doing that via the few emails on this list :).  I think you
>> pretty much covered the biggest (apparent) flake in the equation -
>> inlining, which can fail for all sorts of different reasons.  Beyond that,
>> there's the control flow insensitive aspect of the EA, which is
>> tangentially related to inlining (or lack thereof).
>>
>> There was also another thread a few months back where I was asking why a
>> small local array allocation wasn't scalarized, and the answer there was
>> ordering between loop unrolling and EA passes (I can dig up that thread if
>> you're interested).  The bizarre thing there was the loop operation was
>> folded into a constant, and the compiled method was returning a constant
>> value, but the array allocation was left behind (although it wasn't needed).
>>
>> I agree that there isn't much information about EA in Hotspot (there's a
>> lot of handwaving and inaccuracies online).  In particular, it'd be nice if
>> the performance wiki had a section on making user code play well with EA
>> (just like it has guidance on some other JIT aspects currently).
>>
>>>
>>> ----
>>> Ruslan
>>>
>>>
>>>
>>> 2016-09-13 21:33 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>>>
>>>>
>>>>
>>>> On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin <cheremin at gmail.com>
>>>> wrote:
>>>>
>>>>> >That's my understanding as well (and matches what I'm seeing in some
>>>>> synthetic test harnesses).
>>>>>
>>>>> OK, I just wanted to clear that up, because it is not the first time I
>>>>> have seen BCEA... mentioned in the context of scalar replacement, and I
>>>>> was starting to doubt my eyes :)
>>>>>
>>>>> >It's pretty brittle, sadly, and more importantly, unstable.
>>>>>
>>>>> Running similar experiments, I see the same. E.g. a HashMap.get(TupleKey)
>>>>> lookup can be successfully scalarized in 99% of cases, but scalarization
>>>>> breaks with a slightly changed key generation scheme: the hash code
>>>>> distribution becomes worse, HashMap buckets start to convert themselves
>>>>> to TreeBins, and the TreeBin code is a much harder task for EA.
>>>>>
>>>>> Another can of worms is the mismatch between different inlining
>>>>> heuristics. E.g. the FreqInlineSize and InlineSmallCode thresholds may give
>>>>> different decisions for the same piece of code, and the inlining decision
>>>>> taken depends on whether the method was already compiled or not, which
>>>>> depends on the finest details of initialization order and execution
>>>>> profile. These scenarios became rare in 1.8 with InlineSmallCode increased,
>>>>> but I'm not sure they are gone...
>>>>>
>>>>> Currently, I'm starting to think code needs to be written specifically
>>>>> with EA/SR in mind to be more-or-less stably scalarized. I.e. you can't get
>>>>> it for free (or it will be unstable).
>>>>>
>>>> I'm not sure this is practical, to be honest, at least for a big enough
>>>> application.  I've long considered EA (and scalar replacement) as a bonus
>>>> optimization, and never to rely on it if the allocations would hurt
>>>> otherwise.  I'm just a bit surprised *just* how unstable it appears to be,
>>>> in the "simplest" of cases.
>>>>
>>>> I think code can be written to increase likelihood of scalar
>>>> replacement, but I just can't see how it can be made stable to the point
>>>> where you can rely/depend on it for performance.
>>>>
>>>>>
>>>>> ----
>>>>> Ruslan
>>>>>
>>>>>
>>>>> 2016-09-13 20:51 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com>:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tuesday, September 13, 2016, Cheremin Ruslan <cheremin at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> > I'm seeing some code that iterates over a ConcurrentHashMap's
>>>>>>> entrySet that allocates tens of GB of CHM$MapEntry objects even though they
>>>>>>> don't escape
>>>>>>>
>>>>>>>
>>>>>>> I'm a bit confused: I was sure BCEA-style params do affect EA, but
>>>>>>> don't affect scalar replacement. With bcEscapeAnalyser you can get (sort
>>>>>>> of) inter-procedural EA, but this only allows you to have more allocations
>>>>>>> identified as ArgEscape instead of GlobalEscape. But you can't get more
>>>>>>> NoEscape without real inlining. ArgEscape (afaik) is used only for
>>>>>>> synchronization removals in HotSpot, not for scalar replacements.
>>>>>>>
>>>>>>> Am I incorrect?
>>>>>>
>>>>>> That's my understanding as well (and matches what I'm seeing in some
>>>>>> synthetic test harnesses).
>>>>>>
>>>>>> I'm generally seeing a lot of variability in scalar replacement in
>>>>>> particular, all driven by profile data.  HashMap<Integer, ...>::get(int)
>>>>>> sometimes works at eliminating the box and sometimes doesn't - the
>>>>>> difference appears to be whether Integer::equals is inlined or not, which
>>>>>> in turn depends on whether the lookup finds something or not and whether
>>>>>> the number of successful lookups reaches compilation threshold. It's pretty
>>>>>> brittle, sadly, and more importantly, unstable.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> ----
>>>>>>> Ruslan
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent from my phone
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
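
As a rough illustration of the lookup pattern discussed in this thread, the
sketch below calls HashMap<Integer, String>.get with an int argument, so each
call autoboxes a temporary Integer. Whether C2 scalar-replaces that box depends
on inlining and profile data, as described above. The class and method names
are made up for illustration, and this only shows the shape of the code; it is
not a benchmark.

```java
import java.util.HashMap;
import java.util.Map;

final class BoxedLookupSketch {
    // Builds a map containing only the even keys in [0, limit), so that
    // lookups are a mix of hits and misses.
    static Map<Integer, String> evenKeys(int limit) {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < limit; i += 2) {
            map.put(i, "v" + i);
        }
        return map;
    }

    // Each map.get(i) boxes i via Integer.valueOf(i). Values outside the
    // Integer cache allocate a temporary box unless the JIT removes it.
    static int countHits(Map<Integer, String> map, int limit) {
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (map.get(i) != null) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = evenKeys(100_000);
        System.out.println(countHits(map, 100_000));
    }
}
```

One way to see whether the box is being eliminated is to compare allocation
rates of a run like this with and without -XX:-EliminateAllocations; the
result may well differ between runs, which is the instability discussed above.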

From doug.simon at oracle.com  Tue Sep 13 22:33:59 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 14 Sep 2016 00:33:59 +0200
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
In-Reply-To: <21860311-D6E9-482B-B0A0-F488A516A1D3@oracle.com>
References: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
	<864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>
	<6224CDA0-63E6-442C-BD13-732208FA75A2@oracle.com>
	<999A422E-6CF6-45C5-955B-D58745DBB456@twitter.com>
	<21860311-D6E9-482B-B0A0-F488A516A1D3@oracle.com>
Message-ID: <9B5CFF51-7C4E-44E8-B743-B37411E3C77C@oracle.com>

JPRT testing revealed a test bug in FindUniqueConcreteMethodTest.java where CompilerToVM.findUniqueConcreteMethod was being called with a default method. This is not supported by HotSpot, which is why the only other usage of this private API avoids it:

http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/ec36e3e03d65/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l385

The offending test in FindUniqueConcreteMethodTest has been commented out.

-Doug

> On 08 Sep 2016, at 15:12, Doug Simon <doug.simon at oracle.com> wrote:
> 
>> 
>> On 07 Sep 2016, at 19:52, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>>> 
>>> On Sep 7, 2016, at 2:29 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>>> 
>>>> On 06 Sep 2016, at 20:12, Christian Thalinger <cthalinger at twitter.com> wrote:
>>>> 
>>>> 
>>>>> On Sep 5, 2016, at 6:45 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>> JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.
>>>>> 
>>>>> http://cr.openjdk.java.net/~dnsimon/8165434/
>>>> 
>>>> src/jdk.vm.ci/share/classes/jdk.vm.ci.meta/src/jdk/vm/ci/meta/ModifiersProvider.java
>>>> 
>>>> +    int BRIDGE = 0x0040;
>>>> +    int VARARGS = 0x0080;
>>>> +    int SYNTHETIC = 0x1000;
>>>> +    int ANNOTATION = 0x2000;
>>>> +    int ENUM = 0x4000;
>>>> I wish we could avoid that.  We can't use this stuff because it's HotSpot-dependent, right?
>>>> +        assert ModifiersProvider.SYNTHETIC == getConstant("JVM_ACC_SYNTHETIC", Integer.class);
>>>> +        assert ModifiersProvider.ANNOTATION == getConstant("JVM_ACC_ANNOTATION", Integer.class);
>>>> +        assert ModifiersProvider.BRIDGE == getConstant("JVM_ACC_BRIDGE", Integer.class);
>>>> +        assert ModifiersProvider.VARARGS == getConstant("JVM_ACC_VARARGS", Integer.class);
>>>> +        assert ModifiersProvider.ENUM == getConstant("JVM_ACC_ENUM", Integer.class);
>>>> What if we convert these constants to interface methods and the VM-dependent part has to implement them?  Or maybe even keep the fields and assign them via interface methods.
>>> 
>>> Following your suggestion, I've factored out these VM dependent flags to a new HotSpotModifiers class:
>>> 
>>> http://cr.openjdk.java.net/~dnsimon/8165434.v2/
>> 
>> Excellent.  One question... I noticed HotSpotModifiers is an interface but no other class implements it.  Is there a reason for it being an interface?
> 
> Nope. It's now a class.
> 
>> 
>> Only nit, remove 2011:
>>   2  * Copyright (c) 2011, 2016, Oracle and/or its affiliates. All rights reserved.
> 
> Fixed.
> 
> -Doug
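
For readers following the access-flag discussion above, here is a minimal
sketch showing that the flag values quoted in the review are the raw class-file
bits that core reflection already exposes through getModifiers(). The class
AccessFlagSketch and its helpers are hypothetical and are not part of JVMCI or
of the HotSpotModifiers class mentioned in the thread.

```java
import java.math.RoundingMode;

final class AccessFlagSketch {
    // Class-file access flag values as quoted in the review.
    static final int BRIDGE = 0x0040;
    static final int VARARGS = 0x0080;
    static final int SYNTHETIC = 0x1000;
    static final int ANNOTATION = 0x2000;
    static final int ENUM = 0x4000;

    static boolean has(int modifiers, int flag) {
        return (modifiers & flag) != 0;
    }

    // Raw modifiers of String.format(String, Object...), a varargs method.
    static int modifiersOfStringFormat() {
        try {
            return String.class
                    .getMethod("format", String.class, Object[].class)
                    .getModifiers();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("varargs:    " + has(modifiersOfStringFormat(), VARARGS));
        System.out.println("enum:       " + has(RoundingMode.class.getModifiers(), ENUM));
        System.out.println("annotation: " + has(Override.class.getModifiers(), ANNOTATION));
    }
}
```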


From goetz.lindenmaier at sap.com  Wed Sep 14 06:29:09 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 14 Sep 2016 06:29:09 +0000
Subject: please sponsor? RFR(M): 8165235: [TESTBUG] RTM tests must check
	OS version
In-Reply-To: <57D858F8.1010807@oracle.com>
References: <d509c4eeb57949c88f4b3df0b6f35409@DEWDFE13DE50.global.corp.sap>
	<57D858F8.1010807@oracle.com>
Message-ID: <fcd61fe8541b432390376ddba85b575b@DEWDFE13DE50.global.corp.sap>

Hi Vladimir,

Thanks a lot!
  Best regards,
  Goetz

> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Tuesday, September 13, 2016 21:52
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: please sponsor? RFR(M): 8165235: [TESTBUG] RTM tests must
> check OS version
> 
> Submitted to JPRT.
> 
> Thanks,
> Vladimir
> 
> On 9/8/16 7:38 AM, Lindenmaier, Goetz wrote:
> > Hi,
> >
> > This change was reviewed by Volker Simonis and Filipp Zhinkin.
> > Final webrevs:
> > http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.bs/
> > http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/03/webrev.hs/
> >
> > Could someone please sponsor?
> >
> > Thanks!
> >    Goetz
> >
> >> -----Original Message-----
> >> From: hotspot-compiler-dev
> >> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of
> >> Lindenmaier, Goetz
> >> Sent: Monday, September 5, 2016 13:55
> >> To: hotspot-compiler-dev at openjdk.java.net
> >> Subject: RFR(M): 8165235: [TESTBUG] RTM tests must check OS version
> >>
> >> Hi,
> >>
> >>
> >>
> >> This fixes the RTM tests with respect to supported platforms on ppc.
> >>
> >> Please review this change. I also need a sponsor.
> >> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.bs/
> >>
> >> http://cr.openjdk.java.net/~goetz/wr16/8165235-osRecog/01/webrev.hs/
> >>
> >>
> >> RTM uses special instructions that are only available on recent x86 CPUs.
> >> On x86, this feature does not need OS support. On ppc, the equivalent
> >> functionality, hardware transactional memory, requires OS support. Thus
> >> the feature is only enabled by the VM if CPU and OS are at a specific
> >> level. The tests must check this, too. This holds for AIX and Linux.
> >>
> >>
> >>
> >> To do so, this change introduces rtm/predicate/SupportedOS.java which
> >> checks for proper OS versions on ppc, else returns true.
> >>
> >> The OS version is retrieved from Platform.java, which has new methods
> >> getOsVersionMajor() and  getOsVersionMinor().
> >>
> >> To simplify the checks in the tests, I also introduced a 3-way AndPredicate
> >> constructor.
> >>
> >>
> >>
> >> To simplify the OS version check on AIX, I changed the enabling of RTM on
> >> AIX to require AIX 7.2.
> >>
> >> Before, it was enabled on AIX 7.1.3.30, which contains an important bug
> >> fix. The last digits of this version are not exported to the os.version
> >> property, so I cannot check for them in the test.
> >>
> >>
> >>
> >> Best regards,
> >>
> >>    Goetz.
> >
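
The OS version check described above could look roughly like the sketch below.
The actual implementation of getOsVersionMajor() and getOsVersionMinor() in
Platform.java is not shown in this thread, so the class OsVersionSketch and
all of its helpers are assumptions made purely for illustration.

```java
final class OsVersionSketch {
    // Returns the index-th dot-separated component of a version string as an
    // int, or 0 if the component is missing or not purely numeric.
    static int component(String version, int index) {
        String[] parts = version.split("\\.");
        if (index >= parts.length) {
            return 0;
        }
        try {
            return Integer.parseInt(parts[index]);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    static int major(String version) {
        return component(version, 0);
    }

    static int minor(String version) {
        return component(version, 1);
    }

    // Hypothetical predicate in the spirit of "require AIX 7.2".
    static boolean atLeast(String version, int wantMajor, int wantMinor) {
        int maj = major(version);
        return maj > wantMajor || (maj == wantMajor && minor(version) >= wantMinor);
    }

    public static void main(String[] args) {
        String v = System.getProperty("os.version");
        System.out.println(v + " -> major " + major(v) + ", minor " + minor(v));
    }
}
```

Only the first two components are compared, which matches the point made
above that the trailing digits of an AIX level such as 7.1.3.30 are not
available from the os.version property.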

From vladimir.x.ivanov at oracle.com  Wed Sep 14 10:12:35 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 14 Sep 2016 13:12:35 +0300
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
Message-ID: <4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>

Kris,

> And I'm happy to upstream that patch, if the team is interested.

Sure, we are definitely interested in fixing that. Feel free to file a 
bug and send the fix out for review.

> Now, when I first discovered the problem, my first intuition was that
> it's better to "fix" it in javac. But before nest mates in the Class
> file, there isn't much that javac could do. Changing the Java libraries
> to not use private constructors in inner classes is also doable, but
> needs changing a lot of files.

I agree that javac is not the best place to fix the immediate problem: 
it requires recompilation and there are already lots of problematic 
bytecode shapes out in the wild. The JVM should optimize for that case 
instead.

> So I ended up fixing it in the VM, even though I agree fully with what
> Rémi brought up.

I'm curious how you fixed it. I haven't found a description in the 
thread.

It's possible to force class loading, but I'm worried about undesirable 
effects of class initialization. Is it enough for C2 to have the class 
loaded but not initialized to make it work?

Another approach would be to issue a null check and deoptimize (for 
bridge methods, the check collapses after inlining since the argument is 
always null) or add a nmethod dependency and throw away the code when 
the parameter class is loaded.

Best regards,
Vladimir Ivanov

> The access constructor tag thingy in javac is really a weird hack. If
> you guys ever look at the contents of ArrayList$1, it's really empty
> -- the class doesn't even declare some of the usual structures in a
> normal Class file... Hopefully we can get rid of it in javac soon.

> On Tuesday, September 13, 2016, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>
>
>     On Tuesday, September 13, 2016, Remi Forax <forax at univ-mlv.fr> wrote:
>
>         I've always found the empty inner classes generated by
>         javac to be a kind of hack.
>
>         These classes should be removed in Java 10, thanks to the
>         nestmate attributes.
>
>         http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html
>
>         The other solution is to have an empty class in the JDK which
>         is not visible from javac (the class itself can be marked as
>         synthetic), so javac can use it without creating a method clash.
>
>         And to solve the problem now, the easy solution is to add a
>         package-private constructor in ArrayList.Itr,
>
>     I'm hoping Oracle can take Kris' (Azul) patch (or do something
>     similar).  It might catch more cases than just modifying Itr.
>
>
>         private class Itr implements Iterator<E> {
>             int cursor;       // index of next element to return
>             int lastRet = -1; // index of last element returned; -1 if no such
>             int expectedModCount = modCount;
>
>             Itr() {
>                 // avoid generating a synthetic accessor constructor
>             }
>         }
>
>
>         regards,
>         Rémi
>
>         ------------------------------------------------------------------------
>
>             *From: *"Vitaly Davidovich" <vitalyd at gmail.com>
>             *To: *"Krystal Mok" <rednaxelafx at gmail.com>
>             *Cc: *"hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
>             *Sent: *Monday, September 12, 2016 22:15:41
>             *Subject: *Re: Odd interaction between ArrayList$Itr and
>             Escape Analysis
>
>
>
>             On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok
>             <rednaxelafx at gmail.com> wrote:
>
>                 On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich
>                 <vitalyd at gmail.com> wrote:
>
>                     It seems odd to me as well why inlining won't force
>                     load the missing class(es).  If we're inlining, it
>                     means the method itself or the call chain it's part
>                     of is hot - failing to inline can have negative
>                     side-effects, like this example.  I suppose there
>                     must be a good reason why it doesn't do this though?
>
>
>                 That's because we can't. The JIT compilers are running
>                 on their own threads, and they're not real "Java
>                 threads". So they are not allowed to run arbitrary Java
>                 code. But Java class loading may involve running
>                 arbitrary Java code, e.g. the ClassLoader.loadClass()
>                 upcall.
>                 Force class loading can be done on the triggering side
>                 (for the top-level method), because compilation tasks
>                 are triggered from real Java threads, and they're
>                 allowed to run arbitrary Java code.
>
>             I see, makes sense.  Perhaps there can be an option to turn
>             on loading of required types in the entire compilation unit,
>             after all inlining is done (and therefore make the unloaded
>             types not be barriers for inlining).  I'd personally prefer
>             that over having odd performance differences.
>
>
>                 - Kris
>
>             Thanks Kris.
>
>
>
>     --
>     Sent from my phone
>
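
To make the javac behaviour discussed in this thread concrete, the sketch
below uses reflection to count synthetic constructors on a private inner class
that has the same shape as ArrayList.Itr. The class AccessConstructorSketch is
hypothetical. With older javac versions one synthetic access constructor
(taking an extra parameter of the empty Outer$1 type) is expected; with a javac
that uses nest-based access control none is expected, so the printed number
depends on the compiler used.

```java
import java.lang.reflect.Constructor;

final class AccessConstructorSketch {
    // Same shape as ArrayList.Itr: a private inner class whose implicit
    // default constructor is therefore private as well.
    private class Itr {
        int cursor;
    }

    // Instantiating Itr from the enclosing class is what makes older javac
    // versions emit the synthetic access constructor.
    Object newItr() {
        return new Itr();
    }

    static int syntheticConstructors(Class<?> c) {
        int count = 0;
        for (Constructor<?> ctor : c.getDeclaredConstructors()) {
            if (ctor.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    static int itrSyntheticConstructors() {
        return syntheticConstructors(Itr.class);
    }

    public static void main(String[] args) {
        System.out.println("synthetic constructors in Itr: "
                + itrSyntheticConstructors());
    }
}
```

Declaring an explicit package-private Itr() constructor, as suggested in the
quoted message, should make the count zero on older compilers as well.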

From zoltan.majo at oracle.com  Wed Sep 14 12:19:15 2016
From: zoltan.majo at oracle.com (Zoltán Majó)
Date: Wed, 14 Sep 2016 14:19:15 +0200
Subject: RFR(S): 8159611: C2: ArrayCopy elimination skips required
	parameter checks
In-Reply-To: <8e399624-8e67-ebe6-d348-7691690532e8@oracle.com>
References: <CA+3eh10fWqYQbL-eKEk=ud26pR5C1hM6Z_S0=ROJe-TnipKcNQ@mail.gmail.com>
	<57B2A380.6000408@oracle.com>
	<CA+3eh10yxdunPPYRptfTghA6LKRc5iO+VU=EsFr2PJgQmexQag@mail.gmail.com>
	<41851a79-5ffe-2b9d-504a-6a2301de5384@oracle.com>
	<CA+3eh12Jmcnt83YZ3cWaX1+JdSvY2pTNaLC=uQSLmK-6oGMZHA@mail.gmail.com>
	<7ce01d28-13f5-098a-9898-080f8258881d@oracle.com>
	<CA+3eh11qwNSwAdtxq1QyqRFcHpLsLSnKFSRH_5Vx8Zi8u5Ci_A@mail.gmail.com>
	<8e399624-8e67-ebe6-d348-7691690532e8@oracle.com>
Message-ID: <af67d9ed-2466-d38a-49f1-e179739e8112@oracle.com>

Hi Volker,


On 09/13/2016 05:04 PM, Zoltán Majó wrote:
> Hi Volker,
>
>
> On 09/12/2016 06:35 PM, Volker Simonis wrote:
>> Sorry for the long delay...
>
> thank you for spending more time on this bug and also for the detailed 
> description of the way your solution works!
>
>>
>> Here's my new version:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v3/
>
> That looks good to me.
>
> I did a preliminary performance evaluation with Octane-Gbemu and 
> Octane-PdfJS, results look good on all platforms.  Let me now do a 
> more detailed evaluation.  I'll get back to you once the results are 
> available.

The performance evaluation with webrev.v3 is complete now. The change 
does not cause any performance regressions (neither for SPECjvm2008 nor 
for Octane). Once you update the code according to Vladimir's 
suggestions, I can look again.

Thank you!

Best regards,


Zoltan
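
For reference, the behaviour this fix has to preserve can be sketched in plain
Java. The helper below is hypothetical and is not taken from the regression
test in the webrev. It records which exception System.arraycopy throws for a
negative length and for a zero-length copy between a primitive and a reference
array, which is one case where the specification requires an
ArrayStoreException regardless of the length.

```java
final class ArraycopyChecksSketch {
    // Returns the class of the exception thrown by System.arraycopy for the
    // given arguments, or null if the copy completes normally.
    static Class<? extends RuntimeException> thrownBy(
            Object src, int srcPos, Object dest, int destPos, int length) {
        try {
            System.arraycopy(src, srcPos, dest, destPos, length);
            return null;
        } catch (RuntimeException e) {
            return e.getClass();
        }
    }

    public static void main(String[] args) {
        // Negative length: must throw an IndexOutOfBoundsException subclass.
        System.out.println(thrownBy(new Object[10], 1, new Object[10], 1, -1));
        // Primitive source, reference destination, length 0.
        System.out.println(thrownBy(new int[1], 0, new Object[1], 0, 0));
        // Valid copy.
        System.out.println(thrownBy(new Object[10], 1, new Object[10], 1, 8));
    }
}
```

A single call like this normally runs in the interpreter; the bug discussed
in this thread only shows up once the calling method has been compiled by C2
with the allocation eliminated, so a real test needs a warm-up loop.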

>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
>>
>> I've actually changed PhaseMacroExpand::expand_arraycopy_node() such
>> that it calls generate_arraycopy() with 'length_never_negative' set to
>> true if EliminateAllocations is true (in this case we already checked
>> in LibraryCallKit::inline_arraycopy() that 'length' is not negative).
>> This way I could leave generate_arraycopy() untouched.
>>
>> The generated code now looks as follows:
>>
>> Original version (without 'length < 0' check):
>>
>> 0a7   B5: #    B17 B6 <- B4  Freq: 0,999998
>> 0a7       cmpl    R9, R11    # unsigned
>> 0aa       jb,u  B17  P=0,000001 C=-1,000000
>> ...
>> 0da   B7: #    B18 B8 <- B6 B12 B13  Freq: 0,999997
>> 0da       movl    R11, [rsp + #8]    # spill
>> 0df       testl   R11, R11
>> 0e2       jle     B18  P=0,000001 C=-1,000000
>> ...
>> 0e8   B8: #    B9 <- B7  Freq: 0,999996
>> 0f9       call_leaf_nofp,runtime  oop_disjoint_arraycopy
>> ...
>> 106   B9: #    B10 <- B8 B18 B20  Freq: 0,999997
>> 113       ret
>> ...
>> 184   B17: #    N1 <- B4 B5  Freq: 2,01328e-06
>> 193       call,static  wrapper for:
>> uncommon_trap(reason='intrinsic_or_type_checked_inlining'
>> action='make_not_entrant' debug_id='0')
>>
>> 19d   B18: #    B9 B19 <- B7  Freq: 9,99997e-07
>> 19d       testl   R11, R11
>> 1a0       jge     B9  P=0,999999 C=-1,000000
>> 1a0
>> 1a6   B19: #    B22 B20 <- B18  Freq: 9,99997e-13
>> 1a6       movq    RSI, R8    # spill
>> 1a9       movl    RDX, #1    # int
>> 1ae       movq    RCX, R10    # spill
>> 1b1       movl    R8, #1    # int
>> 1b7       movl    R9, R11    # spill
>>            nop     # 1 bytes pad for loops and calls
>> 1bb       call,static  wrapper for: slow_arraycopy
>>
>> In B5 there's a check whether 'offset+length' is still in the array range.
>> If not, we jump to the uncommon trap in B17.
>> In B7 there's the first check from
>> PhaseMacroExpand::generate_arraycopy() (i.e.
>> generate_nonpositive_guard()). If 'length' is less than or equal to
>> zero we jump to B18 where there's the second check from
>> PhaseMacroExpand::generate_arraycopy() (i.e.
>> generate_negative_guard()). If 'length' is zero, we jump to B9 and
>> return. Otherwise we fall into B19 from where we call slow_arraycopy.
>> slow_arraycopy (which is generated in ObjArrayKlass::copy_array()) will
>> throw an ArrayIndexOutOfBoundsException if 'length' is negative.
>>
>> The new version now looks as follows:
>>
>> 0a2   B5: #    B19 B6 <- B4  Freq: 0,999998
>> 0a2       cmpl    R10, RCX    # unsigned
>> 0a5       jb,u  B19  P=0,000001 C=-1,000000
>> 0a5
>> 0ab   B6: #    B20 B7 <- B5  Freq: 0,999997
>> 0ab       movl    R10, [rsp + #0]    # spill
>> 0af       testl   R10, R10
>> 0b2       jl     B20  P=0,000001 C=-1,000000
>> 0b2
>> ...
>> 0e2   B8: #    B10 B9 <- B7 B13 B14  Freq: 0,999996
>> 0e2       testl   R10, R10
>> 0e5       je,s   B10  P=0,000001 C=-1,000000
>> ...
>> 0e7   B9: #    B10 <- B8  Freq: 0,999995
>> 0f8       call_leaf_nofp,runtime  oop_disjoint_arraycopy
>> ...
>> 105   B10: #    B11 <- B9 B8  Freq: 0,999996
>> 112       ret
>> ...
>> 18e   B19: #    B20 <- B5  Freq: 9,99998e-07
>> 192   B20: #    N1 <- B18 B19 B6  Freq: 3,01327e-06
>> 1a3       call,static  wrapper for:
>> uncommon_trap(reason='intrinsic_or_type_checked_inlining'
>> action='make_not_entrant' debug_id='0')
>>
>> B5 is like before, but is now followed by the extra check for 'length'
>> being not negative in B6. In B8 we now have the first check (i.e.
>> generate_nonpositive_guard()) from
>> PhaseMacroExpand::generate_arraycopy(). It directly checks if 'length'
>> is zero and jumps to B10 (i.e. returns) if so. Otherwise we fall
>> directly into oop_disjoint_arraycopy(). There's no need to check for
>> 'length' being negative and to call 'slow_arraycopy' because this case
>> is now already handled earlier (in B6).
>>
>> Is this OK now?
>>
>> Thank you and best regards,
>> Volker
>>
>>
>> On Fri, Aug 26, 2016 at 3:51 AM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com> wrote:
>>> Looks good.
>>>
>>> Check does not fold because it is different: LT vs LE.
>>>
>>> Actually there are 3 checks together with yours (see
>>> PhaseMacroExpand::generate_arraycopy()):
>>>
>>>    Node* not_pos = generate_nonpositive_guard(ctrl, copy_length,
>>> length_never_negative);
>>>    if (not_pos != NULL) {
>>>      Node* local_ctrl = not_pos, *local_io = *io;
>>>      MergeMemNode* local_mem = MergeMemNode::make(mem);
>>>      transform_later(local_mem);
>>>
>>>      // (6) length must not be negative.
>>>      if (!length_never_negative) {
>>>        generate_negative_guard(&local_ctrl, copy_length, slow_region);
>>>      }
>>>
>>> I think the only way to avoid this is to modify the code in
>>> generate_arraycopy() when EliminateAllocations is true. In that case you
>>> need to generate only the length == 0 check.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>>
>>> On 8/25/16 10:03 AM, Volker Simonis wrote:
>>>> On Tue, Aug 16, 2016 at 11:49 PM, Vladimir Kozlov
>>>> <vladimir.kozlov at oracle.com> wrote:
>>>>> Not generating the exception is definitely a bug.
>>>>>
>>>>> First, about test case. It would be nice if it also verifies other
>>>>> IndexOutOfBoundsException cases.
>>>>>
>>>> I've extended the test case. See:
>>>>
>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611.v2/
>>>>
>>>> With the new test I've caught another problem in C1 (only on x86 and
>>>> s390, but that's not in the OpenJDK yet :).
>>>>
>>>> LIR_Assembler::emit_arraycopy() had a shortcut for length==0 which
>>>> prevented the throwing of an ArrayStoreException if src and dst arrays
>>>> have incompatible type (see do_test2() in the new regression test).
>>>> Note that this is a different error from 8160591 and not fixed by the
>>>> change for 8160591.
>>>>
>>>> I've also moved the new check after the offset + length check as
>>>> suggested by you (see new webrev).
>>>>
>>>> Unfortunately, the new check is still not eliminated. Here's how it 
>>>> looks:
>>>>
>>>> 0ae   B6: #    B20 B7 <- B5  Freq: 0,999997
>>>> 0ae       movl    R9, [rsp + #0]    # spill
>>>> 0b2       testl   R9, R9
>>>> 0b5       jl     B20  P=0,000001 C=-1,000000
>>>> 0b5
>>>> 0bb   B7: #    B12 B8 <- B6  Freq: 0,999996
>>>> 0bb       movl    R11, [R10 + #8 (8-bit)]    # compressed klass ptr
>>>> 0bf       decode_klass_not_null RAX,R11
>>>> 0cc       movl    RBX, [RAX + #16 (8-bit)]    # int
>>>> 0cf       movslq  RCX, RBX    # i2l
>>>> 0d2       movq    RSI, precise klass [Ljava/lang/Object;:
>>>> 0x00007ff1080320d0:Constant:exact *    # ptr
>>>> 0dc       movq    RCX, [RSI + RCX]    # class
>>>> 0e0       cmpq    RAX, RCX    # ptr
>>>> 0e3       jne,us  B12  P=0,170000 C=-1,000000
>>>> 0e3
>>>> 0e5   B8: #    B21 B9 <- B7 B13 B14  Freq: 0,999996
>>>> 0e5       testl   R9, R9
>>>> 0e8       jle     B21  P=0,000001 C=-1,000000
>>>>
>>>> As you can see 'testl   R9, R9' is executed two times.
>>>>
>>>> I've even tried to move the new check after the subtype check, but
>>>> that doesn't help either:
>>>>
>>>> 0da   B7: #    B20 B8 <- B6 B13 B14  Freq: 0,999997
>>>> 0da       movl    R11, [rsp + #8]    # spill
>>>> 0df       testl   R11, R11
>>>> 0e2       jl     B20  P=0,000001 C=-1,000000
>>>> 0e2
>>>> 0e8   B8: #    B10 B9 <- B7  Freq: 0,999996
>>>> 0e8       testl   R11, R11
>>>> 0eb       jle,s   B10  P=0,000001 C=-1,000000
>>>>
>>>> Any idea how this could be fixed?
>>>>
>>>> Thanks,
>>>> Volker
>>>>
>>>> PS: and I still don't have a reproducible benchmark which shows a
>>>> regression with my change...
>>>>
>>>>
>>>>> Actually, the additional dynamic check will help in case a negative
>>>>> length is known during compilation. The allocation code will be
>>>>> eliminated very early instead of waiting for macro expansion:
>>>>>
>>>>>        int length = alloc->in(AllocateNode::ALength)->find_int_con(-1);
>>>>>        if (length < 0) {
>>>>>          NOT_PRODUCT(fail_eliminate = "Array's size is not constant";)
>>>>>          can_eliminate = false;
>>>>>        }
>>>>>
>>>>> About the additional length check in your new test: I think it may be
>>>>> collapsed with the preceding check since it is generated after the other
>>>>> checks. So I would suggest moving it after the offset + length check.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>>
>>>>> On 8/16/16 7:57 AM, Volker Simonis wrote:
>>>>>>
>>>>>> On Tue, Aug 16, 2016 at 7:24 AM, Tobias Hartmann
>>>>>> <tobias.hartmann at oracle.com> wrote:
>>>>>>>
>>>>>>> Hi Volker,
>>>>>>>
>>>>>>> thanks for taking care of this issue!
>>>>>>>
>>>>>>> Did you check what happens if the allocation is not eliminated and the
>>>>>>> macro expansion phase emits another negative guard? Are the checks
>>>>>>> merged?
>>>>>>>
>>>>>> It depends. I just saw that in some cases the regression test worked
>>>>>> before, because the length check was done in
>>>>>> SharedRuntime::slow_arraycopy_C(). So in that case there's obviously
>>>>>> nothing that can be merged. But the test case is obviously a
>>>>>> degenerated example anyway, so I don't think that's a problem.
>>>>>>
>>>>>> If I do a more real-world example like this, where the arraycopy
>>>>>> cannot be eliminated because one of its arguments escapes:
>>>>>>
>>>>>>    public static boolean do_test2(int length, Object[] dest) {
>>>>>>      try {
>>>>>>        System.arraycopy(new Object[10], 1, dest, 1, length);
>>>>>>        return false;
>>>>>>      } catch (IndexOutOfBoundsException e) {
>>>>>>        return true;
>>>>>>      }
>>>>>>    }
>>>>>>
>>>>>> and call it with:
>>>>>>
>>>>>> do_test2(8, new Object[10])
>>>>>>
>>>>>> the generated code for do_test2() unfortunately contains one more
>>>>>> check now with my change (the 'length' field is in [rsp + #0]):
>>>>>>
>>>>>> 0a2   B4: #    B18 B5 <- B3  Freq: 0,999999
>>>>>> 0a2       movl    R9, [rsp + #0]    # spill
>>>>>> 0a6       testl   R9, R9
>>>>>> 0a9       jl     B18  P=0,000001 C=-1,000000
>>>>>> 0a9
>>>>>> 0af   B5: #    B18 B6 <- B4  Freq: 0,999998
>>>>>> 0af       movl    RBX, R9    # spill
>>>>>> 0b2       incl    RBX    # int
>>>>>> 0b4       cmpl    RBX, #10    # unsigned
>>>>>> 0b7       jnbe,u  B18  P=0,000001 C=-1,000000
>>>>>>
>>>>>> The generated code before my change looked like this (again the
>>>>>> 'length' field is in [rsp + #0]):
>>>>>>
>>>>>> 0a1   B4: #    B17 B5 <- B3  Freq: 0,999999
>>>>>> 0a1       movl    R11, [rsp + #8]    # spill
>>>>>> 0a6       incl    R11    # int
>>>>>> 0a9       cmpl    R11, #10    # unsigned
>>>>>> 0ad       jnbe,u  B17  P=0,000001 C=-1,000000
>>>>>>
>>>>>> It seems that the 'length' check has been completely eliminated
>>>>>> before.
>>>>>>
>>>>>> So I need to do some more tests to understand why the new check
>>>>>> isn't eliminated.
>>>>>>
>>>>>> Do you think the new check results in a performance regression? Have
>>>>>> you run some benchmarks?
>>>>>>
>>>>>>> I would prefer brackets around the if body but you don't need to
>>>>>>> send another webrev:
>>>>>>>   if (EliminateAllocations) {
>>>>>>>     generate_negative_guard(length, slow_region);
>>>>>>>   }
>>>>>>
>>>>>>
>>>>>> Yes, I agree.
>>>>>>
>>>>>>> Best regards,
>>>>>>> Tobias
>>>>>>>
>>>>>>> On 12.08.2016 21:13, Volker Simonis wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> can I please have a review and sponsor for the following fix:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2016/8159611
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8159611
>>>>>>>>
>>>>>>>>
>>>>>>>> We are inserting several checks for the arguments of
>>>>>>>> System.arraycopy() in LibraryCallKit::inline_arraycopy() before
>>>>>>>> intrinsifying the call in LibraryCallKit::inline_arraycopy().
>>>>>>>> However, the check for the 'length' argument of arraycopy is
>>>>>>>> postponed to the macro expansion phase in
>>>>>>>> PhaseMacroExpand::generate_arraycopy().
>>>>>>>>
>>>>>>>> But if we are running with EscapeAnalysis and EliminateAllocations,
>>>>>>>> the array allocations inside a call to System.arraycopy() may get
>>>>>>>> eliminated and thus the complete call to System.arraycopy() will be
>>>>>>>> removed (see PhaseMacroExpand::process_users_of_allocation). In this
>>>>>>>> case the extra 'length' check won't be added by
>>>>>>>> PhaseMacroExpand::generate_arraycopy() any more because macro
>>>>>>>> expansion happens after the elimination of macro nodes.
>>>>>>>>
>>>>>>>> In such a case it may happen that System.arraycopy() will silently
>>>>>>>> accept an invalid (i.e. negative) 'length' parameter, although it
>>>>>>>> should actually throw an IndexOutOfBoundsException.
>>>>>>>>
>>>>>>>> The fix is simple: also insert a check for the length field in
>>>>>>>> LibraryCallKit::inline_arraycopy() if we are running with
>>>>>>>> EliminateAllocations.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Volker
>>>>>>>>
>
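The behaviour the fix is meant to preserve can be checked with a small standalone program. This is only a sketch modelled on the do_test2() example quoted above; the class name ArraycopyLengthCheck and the method name rejects() are made up for illustration and are not part of the webrev. It relies on the documented contract of System.arraycopy(), which throws IndexOutOfBoundsException for a negative length.

```java
public class ArraycopyLengthCheck {
    // Returns true if System.arraycopy() rejects the arguments with an
    // IndexOutOfBoundsException, false if the copy is accepted.
    public static boolean rejects(int length, Object[] dest) {
        try {
            System.arraycopy(new Object[10], 1, dest, 1, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Valid copy: srcPos + length = 9 fits into both 10-element arrays.
        System.out.println(rejects(8, new Object[10]));
        // Negative length: must be rejected, compiled or interpreted.
        System.out.println(rejects(-1, new Object[10]));
        // Too long: srcPos + length = 11 exceeds the source array.
        System.out.println(rejects(10, new Object[10]));
    }
}
```

To exercise the compiled path rather than the interpreter, rejects() would have to be called often enough to be JIT compiled; a single run like the one above only checks the specified behaviour.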


From tom.rodriguez at oracle.com  Wed Sep 14 15:04:46 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Wed, 14 Sep 2016 08:04:46 -0700
Subject: RFR: 8165434: [JVMCI] remove uses of setAccessible
In-Reply-To: <9B5CFF51-7C4E-44E8-B743-B37411E3C77C@oracle.com>
References: <BA873BD3-C571-43B2-B41F-EE9FA35FE926@oracle.com>
	<864558C5-C2AD-4D6B-BB6F-568F00BBE28A@twitter.com>
	<6224CDA0-63E6-442C-BD13-732208FA75A2@oracle.com>
	<999A422E-6CF6-45C5-955B-D58745DBB456@twitter.com>
	<21860311-D6E9-482B-B0A0-F488A516A1D3@oracle.com>
	<9B5CFF51-7C4E-44E8-B743-B37411E3C77C@oracle.com>
Message-ID: <280B564D-C015-4AF1-8B1E-B16A8BE0E7A0@oracle.com>

The updated test looks fine to me.

tom

> On Sep 13, 2016, at 3:33 PM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> JPRT testing revealed a test bug in FindUniqueConcreteMethodTest.java where CompilerToVM.findUniqueConcreteMethod was being called with a default method. This is not supported by HotSpot, which is why the only other usage of this private API avoids it:
> 
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/ec36e3e03d65/src/jdk.vm.ci/share/classes/jdk.vm.ci.hotspot/src/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethodImpl.java#l385
> 
> The offending test in FindUniqueConcreteMethodTest has been commented out.
> 
> -Doug
> 
>> On 08 Sep 2016, at 15:12, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>>> 
>>> On 07 Sep 2016, at 19:52, Christian Thalinger <cthalinger at twitter.com> wrote:
>>> 
>>>> 
>>>> On Sep 7, 2016, at 2:29 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>>> 
>>>>> On 06 Sep 2016, at 20:12, Christian Thalinger <cthalinger at twitter.com> wrote:
>>>>> 
>>>>> 
>>>>>> On Sep 5, 2016, at 6:45 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>>> 
>>>>>> JVMCI currently uses java.lang.reflect.AccessibleObject.setAccessible to get at private internals of certain JDK objects (e.g. java.lang.reflect.Method::slot). In light of changes around java.lang.reflect.AccessibleObject::setAccessible at http://openjdk.java.net/jeps/261, this may require extra command line options at some point. To avoid that, I've removed all uses of setAccessible in JVMCI.
>>>>>> 
>>>>>> http://cr.openjdk.java.net/~dnsimon/8165434/
>>>>> 
>>>>> src/jdk.vm.ci/share/classes/jdk.vm.ci.meta/src/jdk/vm/ci/meta/ModifiersProvider.java
>>>>> 
>>>>> +    int BRIDGE = 0x0040;
>>>>> +    int VARARGS = 0x0080;
>>>>> +    int SYNTHETIC = 0x1000;
>>>>> +    int ANNOTATION = 0x2000;
>>>>> +    int ENUM = 0x4000;
>>>>> I wish we could avoid that.  We can't use this stuff because it's HotSpot-dependent, right?
>>>>> +        assert ModifiersProvider.SYNTHETIC == getConstant("JVM_ACC_SYNTHETIC", Integer.class);
>>>>> +        assert ModifiersProvider.ANNOTATION == getConstant("JVM_ACC_ANNOTATION", Integer.class);
>>>>> +        assert ModifiersProvider.BRIDGE == getConstant("JVM_ACC_BRIDGE", Integer.class);
>>>>> +        assert ModifiersProvider.VARARGS == getConstant("JVM_ACC_VARARGS", Integer.class);
>>>>> +        assert ModifiersProvider.ENUM == getConstant("JVM_ACC_ENUM", Integer.class);
>>>>> What if we convert these constants to interface methods and the VM-dependent part has to implement them?  Or maybe even keep the fields and assign them via interface methods.
>>>> 
>>>> Following your suggestion, I've factored out these VM-dependent flags to a new HotSpotModifiers class:
>>>> 
>>>> http://cr.openjdk.java.net/~dnsimon/8165434.v2/
>>> 
>>> Excellent.  One question: I noticed HotSpotModifiers is an interface but no other class implements it.  Is there a reason for it being an interface?
>> 
>> Nope. It's now a class.
>> 
>>> 
>>> Only nit, remove 2011:
>>>  2  * Copyright (c) 2011, 2016, Oracle and/or its affiliates. All rights reserved.
>> 
>> Fixed.
>> 
>> -Doug
> 


From vitalyd at gmail.com  Wed Sep 14 15:46:03 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 14 Sep 2016 11:46:03 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <57D85E4A.2080302@oracle.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
	<57D85E4A.2080302@oracle.com>
Message-ID: <CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>

Hi Vladimir,

Do OSR compilations run EA? I'm looking at some code (roughly) like this:

while (true) {
    for (Entry<...> e : concurrentHashMap.entrySet()) {
         // e does not escape
     }
     Thread.sleep(...);
}

I see the enclosing method OSR compiled, but the iterator and entry aren't
eliminated.  Makes me wonder if OSR doesn't do EA.  Is that the case?

Thanks


On Tuesday, September 13, 2016, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> If allocation is done locally in loop it could be SR (but not guaranteed):
>
> for () {
>   Foo f = new Foo();
> }
>
> "Currently" we can't SR it if there is merge:
>
> Foo f = new Foo();
> for () {
>   f = new Foo();
> }
> x = f.x;
>
> Also we can't SR an array if it has index access because we can't map
> loads/stores to concrete element:
>
> int[] a = new int[3];
> for (i) {
>   x = a[i]
> }
>
> If elements are accessed without index (using array to pass or return
> several values) or a loop is fully unrolled we can SR it:
>
> x0 = a[0];
> x1 = a[1];
> x2 = a[2];
>
> Regards,
> Vladimir
>
> On 9/13/16 12:55 PM, Ruslan Cheremin wrote:
>
>>> There was also another thread a few months back where I was asking why a
>>> small local array allocation wasn't scalarized, and the answer there was
>>> ordering between loop unrolling and EA passes (I can dig up that thread
>>> if you're interested).
>>
>> It would be very nice, please -- I've tried to google it by myself
>> (because you've noted it already in the thread) but wasn't able to guess
>> the right keywords :)
>>
>>
>> 2016-09-13 22:44 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com <mailto:
>> vitalyd at gmail.com>>:
>>
>>
>>
>>     On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin <cheremin at gmail.com
>> <mailto:cheremin at gmail.com>> wrote:
>>
>>         >how it can be made stable to the point where you can rely/depend
>>         on it for performance.
>>
>>         Well, the same can be said about any JIT optimization (maybe it is
>>         time to rename dynamic runtime to stochastic runtime?). Personally
>>         I see SR as having the same order of stability as inlining.
>>         Actually, apart from a few SR-specific issues (like with merge
>>         points), EA/SR mostly follows inlining: if you have enough scope
>>         inlined you'll have, say, an 80% chance of SR. From my perspective
>>         it is inlining which is so surprisingly unstable.
>>
>>     Yeah, I'd agree.  The difference, in my mind, is failing to inline a
>> function may not have as drastic performance implications as failing to
>> eliminate temporaries.
>>
>>
>>         BTW: have you considered sharing your experience with EA/SR
>>         pitfalls? Even if "increase likelihood" is the best option
>>         available, there is still very little information about it on
>>         the net.
>>
>>     I'm kind of doing that via the few emails on this list :).  I think
>> you pretty much covered the biggest (apparent) flake in the equation -
>> inlining, which can fail for all sorts of different
>>     reasons.  Beyond that, there's the control flow insensitive aspect of
>> the EA, which is tangentially related to inlining (or lack thereof).
>>
>>     There was also another thread a few months back where I was asking
>> why a small local array allocation wasn't scalarized, and the answer there
>> was ordering between loop unrolling and EA passes (I
>>     can dig up that thread if you're interested).  The bizarre thing
>> there was the loop operation was folded into a constant, and the compiled
>> method was returning a constant value, but the array
>>     allocation was left behind (although it wasn't needed).
>>
>>     I agree that there isn't much information about EA in Hotspot
>> (there's a lot of handwaving and inaccuracies online).  In particular, it'd
>> be nice if the performance wiki had a section on making
>>     user code play well with EA (just like it has guidance on some other
>> JIT aspects currently).
>>
>>
>>         ----
>>         Ruslan
>>
>>
>>
>>         2016-09-13 21:33 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com
>> <mailto:vitalyd at gmail.com>>:
>>
>>
>>
>>             On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin <
>> cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>>
>>                 >That's my understanding as well (and matches what I'm
>> seeing in some synthetic test harnesses).
>>
>>                 Ok, I just tried to clear it up, because it is not the
>>                 first time I see BCEA... mentioned in the context of scalar
>>                 replacement, and I start to doubt my eyes :)
>>
>>                 >It's pretty brittle, sadly, and more importantly,
>>                 unstable.
>>
>>                 Making similar experiments I see the same. E.g. a
>>                 HashMap.get(TupleKey) lookup can be successfully scalarized
>>                 in 99% of cases, but scalarization breaks once the key
>>                 generation scheme is changed slightly, because the hashcode
>>                 distribution becomes worse, HashMap buckets start to
>>                 convert themselves to TreeBins, and the TreeBins code is a
>>                 much harder task for EA.
>>
>>                 Another can of worms is the mismatch between different
>>                 inlining heuristics. E.g. the FreqInlineSize and
>>                 InlineSmallCode thresholds may give different decisions for
>>                 the same piece of code, and the inlining decision taken
>>                 depends on whether the method was already compiled or not,
>>                 which depends on the finest details of initialization
>>                 order and execution profile. These scenarios become rare in
>>                 1.8 with InlineSmallCode increased, but I'm not sure they
>>                 are gone...
>>
>>                 Currently, I'm starting to think code needs to be written
>>                 specifically with EA/SR in mind to be more-or-less stably
>>                 scalarized. I.e. you can't get it for free (or it will be
>>                 unstable).
>>
>>             I'm not sure this is practical, to be honest, at least for a
>> big enough application.  I've long considered EA (and scalar replacement)
>> as a bonus optimization, and never to rely on it if
>>             the allocations would hurt otherwise.  I'm just a bit
>> surprised *just* how unstable it appears to be, in the "simplest" of cases.
>>
>>             I think code can be written to increase likelihood of scalar
>> replacement, but I just can't see how it can be made stable to the point
>> where you can rely/depend on it for performance.
>>
>>
>>                 ----
>>                 Ruslan
>>
>>
>>                 2016-09-13 20:51 GMT+03:00 Vitaly Davidovich <
>> vitalyd at gmail.com <mailto:vitalyd at gmail.com>>:
>>
>>
>>
>>                     On Tuesday, September 13, 2016, Cheremin Ruslan <
>> cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>>
>>                         > I'm seeing some code that iterates over a
>> ConcurrentHashMap's entrySet that allocates tens of GB of CHM$MapEntry
>> objects even though they don't escape
>>
>>
>>                         I'm a bit confused: I was sure BCEA-style params
>>                         do affect EA, but don't affect scalar replacement.
>>                         With bcEscapeAnalyzer you can get (sort of)
>>                         inter-procedural EA, but this only allows you to
>>                         have more allocations identified as ArgEscape
>>                         instead of GlobalEscape. But you can't get more
>>                         NoEscape without real inlining. ArgEscape (afaik) is
>>                         used only for synchronization removal in HotSpot,
>>                         not for scalar replacement.
>>
>>                         Am I incorrect?
>>
>>                     That's my understanding as well (and matches what I'm
>> seeing in some synthetic test harnesses).
>>
>>                     I'm generally seeing a lot of variability in scalar
>> replacement in particular, all driven by profile data.  HashMap<Integer,
>> ...>::get(int) sometimes works at eliminating the box
>>                     and sometimes doesn't - the difference appears to be
>> whether Integer::equals is inlined or not, which in turn depends on whether
>> the lookup finds something or not and whether the
>>                     number of successful lookups reaches compilation
>> threshold. It's pretty brittle, sadly, and more importantly, unstable.
>>
>>
>>
>>                         ----
>>                         Ruslan
>>
>>
>>
>>                     --
>>                     Sent from my phone
>>
>>
>>
>>
>>
>>
>>

-- 
Sent from my phone
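
The four scalar replacement (SR) cases from Vladimir Kozlov's quoted reply above use pseudocode loops. Written out as compilable Java they might look as follows. This is only a sketch: the class name, method names and the Foo type are invented here, and whether C2 actually scalar-replaces any of these allocations depends on inlining and profile data, as discussed in the thread.

```java
public class ScalarReplacementCases {
    static final class Foo {
        int x;
        Foo(int x) { this.x = x; }
    }

    // Case 1: allocation is local to one loop iteration.
    // Per the thread, this may be scalar replaced (not guaranteed).
    public static int localInLoop(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            Foo f = new Foo(i);
            sum += f.x;
        }
        return sum;
    }

    // Case 2: two allocation sites merge at the use after the loop.
    // Per the thread, this currently cannot be scalar replaced.
    public static int mergedAfterLoop(int n) {
        Foo f = new Foo(-1);
        for (int i = 0; i < n; i++) {
            f = new Foo(i);
        }
        return f.x;
    }

    // Case 3: array accessed with a variable index, so loads cannot be
    // mapped to a concrete element. Per the thread, no SR.
    public static int indexedArray(int n) {
        int[] a = new int[3];
        a[0] = 1;
        a[1] = 2;
        a[2] = 3;
        int x = 0;
        for (int i = 0; i < n; i++) {
            x += a[i % 3];
        }
        return x;
    }

    // Case 4: array accessed only with constant indices (or after full
    // unrolling). Per the thread, this can be scalar replaced.
    public static int constantIndices() {
        int[] a = new int[3];
        a[0] = 1;
        a[1] = 2;
        a[2] = 3;
        return a[0] + a[1] + a[2];
    }

    public static void main(String[] args) {
        System.out.println(localInLoop(4));
        System.out.println(mergedAfterLoop(5));
        System.out.println(indexedArray(3));
        System.out.println(constantIndices());
    }
}
```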

From vladimir.x.ivanov at oracle.com  Wed Sep 14 16:13:50 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 14 Sep 2016 19:13:50 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
	<57D85E4A.2080302@oracle.com>
	<CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
Message-ID: <45c9ea9a-872a-b5f7-28fc-c1211ef8f044@oracle.com>

> Do OSR compilations run EA? I'm looking at some code (roughly) like this:
>
> while (true) {
>     for (Entry<...> e : concurrentHashMap.entrySet()) {
>          // e does not escape
>      }
>      Thread.sleep(...);
> }
>
> I see the enclosing method OSR compiled, but the iterator and entry
> aren't eliminated.  Makes me wonder if OSR doesn't do EA.  Is that the case?

EA is performed for OSR compilations, but keep in mind that the entry 
point for OSR compilation is the back branch in the loop.

The whole JVM state is passed as the argument, so EA can only detect 
that something is local for the duration of a single loop iteration, not 
when something temporary is allocated for the whole loop.

It means that the iterator object can't be eliminated in OSR 
compilation. Probably, it causes the element object to escape as well.

Best regards,
Vladimir Ivanov
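
One way to act on the explanation above is to move the iteration into its own method, so that the iterator is created and dies inside a method that can be compiled from its normal entry point instead of through an OSR entry in the enclosing loop. The sketch below assumes that this restructuring is acceptable for the application; the class and method names are invented, and it makes no claim that the iterator or entry allocations will actually be eliminated, since that still depends on inlining.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OsrLoopSketch {
    // The for-each loop lives in a small method of its own. Once hot, it
    // can get a regular (non-OSR) compilation or be inlined into a caller,
    // so the iterator's whole lifetime is visible to escape analysis.
    public static long sumValues(ConcurrentHashMap<String, Integer> map) {
        long sum = 0;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            sum += e.getValue(); // e does not escape
        }
        return sum;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 100; i++) {
            map.put("k" + i, i);
        }
        long total = 0;
        // Stands in for the long-running outer loop of the original code;
        // this loop may still be OSR compiled, but it allocates nothing.
        for (int round = 0; round < 20_000; round++) {
            total += sumValues(map);
        }
        System.out.println(total);
    }
}
```

If a single call to sumValues() iterated over a very large map, that method could itself be OSR compiled before its first normal compilation, so the restructuring only helps when the inner loop is short relative to the number of calls.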


From vitalyd at gmail.com  Wed Sep 14 16:15:41 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 14 Sep 2016 12:15:41 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
	<57D85E4A.2080302@oracle.com>
	<CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
Message-ID: <CAHjP37GA1xYWj+c4wHe7eQ1z7ry7bYtpYP0iBBOQw0dzHy_sCg@mail.gmail.com>

Looking at PrintInlining output of an application with places where SR
isn't happening (but should, in my mind), it appears that lots of call
graphs along the path where the object "escapes" end because some part of
the path fails to inline with "already compiled into a big method" failure
reason.  So basically we end up hitting a "black hole" along the way, and
the JIT can no longer prove the object doesn't escape.

I wonder how 1000 (when tiered is disabled) for 64-bit was chosen as the
default value for InlineSmallCode - is that still the current thinking as a
good default? I understand the rationale for this check, but it also seems
like this heuristic is somewhat problematic; how do we, for example, know
that inlining that method (and whatever was inlined into it to cause it to
be > InlineSmallCode) won't produce smaller machine code because more
optimizations can be done? It also seems like it would be nice to force
inlining if bcEscapeAnalysis estimates that some allocations can go away as
a result.

Also, is the size of the method already taking into account any
untaken/cold code pruning that was done prior to code gen? I assume so, but
just wanted to check.

Finally, would it be possible to print out the actual native code size as
part of the "already compiled into a big method" message? Otherwise, it's
hard to say what value I should try for InlineSmallCode.

Thanks

P.S. When is @ForceInline going to be part of Java SE? :)
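
Before experimenting with -XX:InlineSmallCode=<n>, it can help to check which values the running JVM actually uses. The sketch below reads the inlining thresholds named in this thread through com.sun.management.HotSpotDiagnosticMXBean. The class name InlineFlagProbe and the "unavailable" fallback are made up for illustration, and the code assumes a HotSpot JVM; on other JVMs or builds without these flags it just reports them as unavailable.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class InlineFlagProbe {
    // Returns the current value of a VM flag as a string, or "unavailable"
    // if the flag (or the HotSpot diagnostic bean) does not exist.
    public static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // IllegalArgumentException for unknown flags, or a
            // NullPointerException if the bean is not provided.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        String[] flags = {"InlineSmallCode", "FreqInlineSize", "MaxInlineSize"};
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

This only reports the thresholds; it does not show the native code size of a particular method, which is the information asked for above.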

From vitalyd at gmail.com  Wed Sep 14 16:18:19 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 14 Sep 2016 12:18:19 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <45c9ea9a-872a-b5f7-28fc-c1211ef8f044@oracle.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
	<57D85E4A.2080302@oracle.com>
	<CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
	<45c9ea9a-872a-b5f7-28fc-c1211ef8f044@oracle.com>
Message-ID: <CAHjP37HJYQr3R26dTWeAfcLCXj6X9wT4hJsOS+PMjnftQx1=AA@mail.gmail.com>

On Wed, Sep 14, 2016 at 12:13 PM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

>> Do OSR compilations run EA? I'm looking at some code (roughly) like this:
>>
>> while (true) {
>>     for (Entry<...> e : concurrentHashMap.entrySet()) {
>>          // e does not escape
>>      }
>>      Thread.sleep(...);
>> }
>>
>> I see the enclosing method OSR compiled, but the iterator and entry
>> aren't eliminated.  Makes me wonder if OSR doesn't do EA.  Is that the
>> case?
>>
>
> EA is performed for OSR compilations, but keep in mind that the entry
> point for OSR compilation is the back branch in the loop.
>
> The whole JVM state is passed as the argument, so EA can only detect that
> something is local for the duration of a single loop iteration, not when
> something temporary is allocated for the whole loop.
>
> It means that the iterator object can't be eliminated in OSR compilation.
> Probably, it causes the element object to escape as well.
>
Darn! Ok, thanks Vladimir - that would explain what I'm seeing.  So
basically I need to find a way to avoid OSR compiles for cases like this.

>
> Best regards,
> Vladimir Ivanov
>
> On Tuesday, September 13, 2016, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>>
>>     If allocation is done locally in loop it could be SR (but not
>>     guaranteed):
>>
>>     for () {
>>       Foo f = new Foo();
>>     }
>>
>>     "Currently" we can't SR it if there is merge:
>>
>>     Foo f = new Foo();
>>     for () {
>>       f = new Foo();
>>     }
>>     x = f.x;
>>
>>     Also we can't SR an array if it has index access because we can't
>>     map loads/stores to concrete element:
>>
>>     int[] a = new int[3];
>>     for (i) {
>>       x = a[i]
>>     }
>>
>>     If elements are accessed without index (using array to pass or
>>     return several values) or a loop is fully unrolled we can SR it:
>>
>>     x0 = a[0];
>>     x1 = a[1];
>>     x2 = a[2];
>>
>>     Regards,
>>     Vladimir
>>
>>     On 9/13/16 12:55 PM, Ruslan Cheremin wrote:
>>
>>             There was also another thread a few months back where I was
>>             asking why a small local array allocation wasn't scalarized,
>>             and the answer there was ordering between loop unrolling and
>>             EA passes (I can
>>
>>         dig up that thread if you're interested).
>>
>>         It would be very nice, please -- I've tried to google it by
>>         myself (because you've noted it already in the thread) but
>>         wasn't able to guess right keywords :)
>>
>>
>>         2016-09-13 22:44 GMT+03:00 Vitaly Davidovich <vitalyd at gmail.com
>>         <mailto:vitalyd at gmail.com>>:
>>
>>
>>
>>             On Tue, Sep 13, 2016 at 3:32 PM, Ruslan Cheremin
>>         <cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>>
>>                 >how it can be made stable to the point where you can
>>         rely/depend on it for performance.
>>
>>                 Well, same can be said about any JIT optimization --
>>         (may be it is time to rename dynamic runtime to stochastic
>>         runtime?). Personally I see SR to be the same order of stability
>>         as inlining.
>>                 Actually, apart from few SR-specific issues (like with
>>         merge points), EA/SR mostly follow inlining: if you have enough
>>         scope inlined you'll have, say, 80% chance of SR. From my
>>         perspective it
>>                 is inlining which is so surprisingly unstable.
>>
>>             Yeah, I'd agree.  The difference, in my mind, is failing to
>>         inline a function may not have as drastic performance
>>         implications as failing to eliminate temporaries.
>>
>>
>>                 BTW: have you considered to share you experience with
>>         EA/SR pitfalls? Even if "increase likelihood" is the best option
>>         available -- there are still very little information about it in
>>         the net.
>>
>>             I'm kind of doing that via the few emails on this list :).
>>         I think you pretty much covered the biggest (apparent) flake in
>>         the equation - inlining, which can fail for all sorts of different
>>             reasons.  Beyond that, there's the control flow insensitive
>>         aspect of the EA, which is tangentially related to inlining (or
>>         lack thereof).
>>
>>             There was also another thread a few months back where I was
>>         asking why a small local array allocation wasn't scalarized, and
>>         the answer there was ordering between loop unrolling and EA
>>         passes (I
>>             can dig up that thread if you're interested).  The bizarre
>>         thing there was the loop operation was folded into a constant,
>>         and the compiled method was returning a constant value, but the
>>         array
>>             allocation was left behind (although it wasn't needed).
>>
>>             I agree that there isn't much information about EA in
>>         Hotspot (there's a lot of handwaving and inaccuracies online).
>>         In particular, it'd be nice if the performance wiki had a
>>         section on making
>>             user code play well with EA (just like it has guidance on
>>         some other JIT aspects currently).
>>
>>
>>                 ----
>>                 Ruslan
>>
>>
>>
>>                 2016-09-13 21:33 GMT+03:00 Vitaly Davidovich
>>         <vitalyd at gmail.com <mailto:vitalyd at gmail.com>>:
>>
>>
>>
>>                     On Tue, Sep 13, 2016 at 2:25 PM, Ruslan Cheremin
>>         <cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>>
>>                         >That's my understanding as well (and matches
>>         what I'm seeing in some synthetic test harnesses).
>>
>>                         Ok, I just tried to clear it out, because it is
>>         not the first time I see BCEA... noted in context of scalar
>>         replacement, and I start to doubt my eyes :)
>>
>>                         >It's pretty brittle, sadly, and more
>>         importantly, unstable.
>>
>>                         Making similar experiments I see the same. E.g.
>>         HashMap.get(TupleKey) lookup can be successfully scalarized 99%
>>         cases, but scalarization become broken once with slightly
>>         changed key
>>                         generation schema -- because hashcodes
>>         distribution becomes worse, and HashMap buckets start to convert
>>         themself to TreeBins, and TreeBins code is much harder task for
>> EA.
>>
>>                         Another can of worms is mismatch between
>>         different inlining heuristics. E.g. FreqInlineSize and
>>         InlineSmallCode thresholds may give different decision for the
>>         same piece of code, and
>>                         taken inlining decision depends on was method
>>         already compiled or not -- which depends on thinnest details of
>>         initialization order and execution profile. This scenarios
>>         becomes rare in
>>                         1.8 with InlineSmallCode increased, but I'm not
>>         sure they are gone...
>>
>>                         Currently, I'm starting to think code needs to
>>         be specifically written for EA/SR in mind to be more-or-less
>>         stably scalarized. I.e. you can't get it for free (or it will be
>>         unstable).
>>
>>                     I'm not sure this is practical, to be honest, at
>>         least for a big enough application.  I've long considered EA
>>         (and scalar replacement) as a bonus optimization, and never to
>>         rely on it if
>>                     the allocations would hurt otherwise.  I'm just a
>>         bit surprised *just* how unstable it appears to be, in the
>>         "simplest" of cases.
>>
>>                     I think code can be written to increase likelihood
>>         of scalar replacement, but I just can't see how it can be made
>>         stable to the point where you can rely/depend on it for
>> performance.
>>
>>
>>                         ----
>>                         Ruslan
>>
>>
>>                         2016-09-13 20:51 GMT+03:00 Vitaly Davidovich
>>         <vitalyd at gmail.com <mailto:vitalyd at gmail.com>>:
>>
>>
>>
>>                             On Tuesday, September 13, 2016, Cheremin
>>         Ruslan <cheremin at gmail.com <mailto:cheremin at gmail.com>> wrote:
>>
>>                                 > I'm seeing some code that iterates
>>         over a ConcurrentHashMap's entrySet that allocates tens of GB of
>>         CHM$MapEntry objects even though they don't escape
>>
>>
>>                                 I'm a bit confused: I was sure
>>         BCEA-style params do affect EA, but don't affect scalar
>>         replacement. With bcEscapeAnalyser you can get (sort of)
>>         inter-procedural EA, but this
>>                                 only allows you to have more allocations
>>         identified as ArgEscape instead of GlobalEscape. But you can't
>>         get more NoEscape without real inlining. ArgEscape (afaik) is
>>         used only
>>                                 for synchronization removals in HotSpot,
>>         not for scalar replacements.
>>
>>                                 Am I incorrect?
>>
>>                             That's my understanding as well (and matches
>>         what I'm seeing in some synthetic test harnesses).
>>
>>                             I'm generally seeing a lot of variability in
>>         scalar replacement in particular, all driven by profile data.
>>         HashMap<Integer, ...>::get(int) sometimes works at eliminating
>>         the box
>>                             and sometimes doesn't - the difference
>>         appears to be whether Integer::equals is inlined or not, which
>>         in turn depends on whether the lookup finds something or not and
>>         whether the
>>                             number of successful lookups reaches
>>         compilation threshold. It's pretty brittle, sadly, and more
>>         importantly, unstable.
>>
>>
>>
>>                                 ----
>>                                 Ruslan
>>
>>
>>
>>                             --
>>                             Sent from my phone
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>> Sent from my phone
>>
>
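
The allocation shapes Vladimir Kozlov lists in the quoted text above can be written out as small Java methods. This is only a sketch for experimenting (for example with -XX:+PrintCompilation or a debug build's escape analysis output); the class and method names are made up for illustration, and whether C2 actually scalar-replaces any of these allocations depends on inlining, profile data, OSR versus normal compilation, and the JVM version.

```java
// Illustrative shapes only; none of these is guaranteed to be
// scalar-replaced (or not) on a given JVM.
public class ScalarReplacementShapes {
    static final class Foo {
        final int x;
        Foo(int x) { this.x = x; }
    }

    // Allocation is local to each iteration: a candidate for scalar
    // replacement, but not guaranteed.
    static int localInLoop(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            Foo f = new Foo(i);
            sum += f.x;
        }
        return sum;
    }

    // Two allocation sites merge into one variable that is read after the
    // loop: the "merge" case described as not currently scalar-replaceable.
    static int mergedAfterLoop(int n) {
        Foo f = new Foo(-1);
        for (int i = 0; i < n; i++) {
            f = new Foo(i);
        }
        return f.x;
    }

    // Array read with a variable index: loads cannot be mapped to one
    // concrete element unless the loop happens to be fully unrolled.
    static int indexedArray(int n) {
        int[] a = new int[3];
        a[0] = 1;
        a[1] = 2;
        a[2] = 3;
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a[i % 3];
        }
        return sum;
    }

    // Array used only through constant indices (for example to pass or
    // return several values): each element maps to a concrete slot.
    static int constantIndices(int n) {
        int[] a = new int[3];
        a[0] = n;
        a[1] = n + 1;
        a[2] = n + 2;
        return a[0] + a[1] + a[2];
    }

    public static void main(String[] args) {
        System.out.println(localInLoop(4));
        System.out.println(mergedAfterLoop(4));
        System.out.println(indexedArray(4));
        System.out.println(constantIndices(4));
    }
}
```

Calling each method in a hot loop from main and comparing allocation rates between the variants is one way to observe the differences discussed in this thread.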

From cheremin at gmail.com  Wed Sep 14 17:50:42 2016
From: cheremin at gmail.com (Ruslan)
Date: Wed, 14 Sep 2016 20:50:42 +0300
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <CAHjP37GA1xYWj+c4wHe7eQ1z7ry7bYtpYP0iBBOQw0dzHy_sCg@mail.gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
	<57D85E4A.2080302@oracle.com>
	<CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
	<CAHjP37GA1xYWj+c4wHe7eQ1z7ry7bYtpYP0iBBOQw0dzHy_sCg@mail.gmail.com>
Message-ID: <A449FDAE-20A4-4E33-BBA4-00F12E5E02AB@gmail.com>

AFAIK, InlineSmallCode is 2000 in 1.8+, which makes such scenarios less frequent.
----
Ruslan

> On 14 Sep 2016, at 19:15, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> Looking at PrintInlining output of an application with places where SR isn't happening (but should, in my mind), it appears that lots of call graphs along the path where the object "escapes" end because some part of the path fails to inline with "already compiled into a big method" failure reason.  So basically we end up hitting a "black hole" along the way, and the JIT can no longer prove the object doesn't escape.
> 
> I wonder how 1000 (when tiered is disabled) for 64bit was chosen as the default value for InlineSmallCode - is that still the current thinking as a good default? I understand the rationale for this check, but it also seems like this heuristic is somewhat problematic; how do we, for example, know that inlining that method (and whatever was inlined into it to cause it to be > InlineSmallCode) won't produce smaller machine code because more optimizations can be done? It also seems like it would be nice to force inlining if bcEscapeAnalysis estimates that some allocations can go away as a result.
> 
> Also, is the size of the method already taking into account any untaken/cold code pruning that was done prior to code gen? I assume so, but just wanted to check.
> 
> Finally, would it be possible to print out the actual native code size as part of the "already compiled into a big method" message? Otherwise, it's hard to say what value I should try for InlineSmallCode.
> 
> Thanks
> 
> P.S. When is @ForceInline going to be part of Java SE? :)

From vitalyd at gmail.com  Wed Sep 14 17:52:57 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 14 Sep 2016 13:52:57 -0400
Subject: MaxBCEAEstimateSize and inlining clarification
In-Reply-To: <A449FDAE-20A4-4E33-BBA4-00F12E5E02AB@gmail.com>
References: <00C16B65-A85F-491E-9384-1172735D9952@gmail.com>
	<CAHjP37Es_HUcbSHvySoCJqb7-hNjtLSCxyHi6C6rmY7gqqVOnw@mail.gmail.com>
	<CAOwENiKkrARDOur8gMbit_zrxQDHcw+ABEg29RDvvMueD5xzbA@mail.gmail.com>
	<CAHjP37G5tD8in2o42-KwUffUQhGeueL6FkAfLcpbHB6zA8bf2Q@mail.gmail.com>
	<CAOwENi+Ko64mUojCW_qR=n_J+N2WOjxb6KdrHLJ5vF_Wyqi+Sw@mail.gmail.com>
	<CAHjP37Hi66O+xZ8JKKEkReOODUqCx9iw_cNtXDY6dAUZEW7icw@mail.gmail.com>
	<CAOwENiKnhQBBDERv8ah5eT8b2mGdhO0EQ9LsmtNBYr4oyB+vDA@mail.gmail.com>
	<57D85E4A.2080302@oracle.com>
	<CAHjP37FjaUT7S0SnVjZ4n-8qwxS5B-7GzX98QnHt+1JSrgPzqg@mail.gmail.com>
	<CAHjP37GA1xYWj+c4wHe7eQ1z7ry7bYtpYP0iBBOQw0dzHy_sCg@mail.gmail.com>
	<A449FDAE-20A4-4E33-BBA4-00F12E5E02AB@gmail.com>
Message-ID: <CAHjP37E+V0Wi5f4Yy-sFL4AMNsdkiXoThX_1V=NoUqZYxLy60w@mail.gmail.com>

On Wed, Sep 14, 2016 at 1:50 PM, Ruslan <cheremin at gmail.com> wrote:

> Afaik, InlineSmallCode is 2000 in 1.8+. Which makes such scenarios not so
> often
>
If you disable tiered compilation, it's 1000.

> ----
> Ruslan
>
> > On 14 Sep 2016, at 19:15, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> >
> > Looking at PrintInlining output of an application with places where SR
> isn't happening (but should, in my mind), it appears that lots of call
> graphs along the path where the object "escapes" end because some part of
> the path fails to inline with "already compiled into a big method" failure
> reason.  So basically we end up hitting a "black hole" along the way, and
> the JIT can no longer prove the object doesn't escape.
> >
> > I wonder how 1000 (when tiered is disabled) for 64bit was chosen as the
> default value for InlineSmallCode - is that still the current thinking as a
> good default? I understand the rationale for this check, but it also seems
> like this heuristic is somewhat problematic; how do we, for example, know
> that inlining that method (and whatever was inlined into it to cause it to
> be > InlineSmallCode) won't produce smaller machine code because more
> optimizations can be done? It also seems like it would be nice to force
> inlining if bcEscapeAnalysis estimates that some allocations can go away as
> a result.
> >
> > Also, is the size of the method already taking into account any
> untaken/cold code pruning that was done prior to code gen? I assume so, but
> just wanted to check.
> >
> > Finally, would it be possible to print out the actual native code size
> as part of the "already compiled into a big method" message? Otherwise,
> it's hard to say what value I should try for InlineSmallCode.
> >
> > Thanks
> >
> > P.S. When is @ForceInline going to be part of Java SE? :)
>
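
Since the effective InlineSmallCode value depends on whether tiered compilation is enabled, it can help to read the flags from the running JVM rather than rely on remembered defaults. The following is a minimal sketch, assuming a HotSpot-based JDK where com.sun.management.HotSpotDiagnosticMXBean is available; the class name InlineFlagProbe and the helper flagValue are hypothetical names chosen for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

public class InlineFlagProbe {
    // Returns the current value of a VM flag, or empty if the flag (or the
    // diagnostic bean) does not exist on this JVM.
    static Optional<String> flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException e) {
            // getVMOption throws this when the option name is unknown.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] names = {"TieredCompilation", "InlineSmallCode", "FreqInlineSize"};
        for (String name : names) {
            System.out.println(name + " = " + flagValue(name).orElse("<not available>"));
        }
    }
}
```

Running it once with default settings and once with -XX:-TieredCompilation shows which InlineSmallCode value actually applies to a given configuration.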

From vladimir.kozlov at oracle.com  Wed Sep 14 18:11:28 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 14 Sep 2016 11:11:28 -0700
Subject: Possible integer overflow in LIRGenerator::generate_address on
	SPARC and other platforms
In-Reply-To: <1c3f2f5368754962a6e2b2684c6e1fa2@DEWDFE13DE14.global.corp.sap>
References: <1c3f2f5368754962a6e2b2684c6e1fa2@DEWDFE13DE14.global.corp.sap>
Message-ID: <57D992D0.4010009@oracle.com>

CC to group since I am not familiar with C1.

On SPARC generate_address() is called only from

   LIR_Address* generate_address(LIR_Opr base, int disp, BasicType type) {
     return generate_address(base, LIR_OprFact::illegalOpr, 0, disp, type);
   }

So it is not an issue. But I agree with you in general.

On x86 LIRGenerator::emit_array_address() may have this problem.

The only explanation I see for why we did not hit it is that the Interpreter may be more careful about checking it and throws an exception.

It could be that C1 checks these values somewhere else.

Thanks,
Vladimir

On 9/6/16 9:21 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> I was wondering about the following code in LIRGenerator::generate_address in c1_LIRGenerator_sparc.cpp (and some other platforms):
>
> if (index->is_constant()) {
>
>      disp += index->as_constant_ptr()->as_jint() << shift;
>
> It's fine to compute the constant in general, but disp is an int!
>
> Seems like the only user of this function which uses an index is Unsafe put/get where nobody has noticed it yet.
>
> Do you think we have to fix this in 9?
>
> I can open a bug if you like.
>
> Best regards,
>
> Martin
>
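
The concern about `disp += index << shift` with a 32-bit disp can be reproduced with plain arithmetic. The sketch below uses Java rather than the HotSpot C++ source, so int wraparound is well defined here (in C++ the equivalent signed overflow is not something to rely on); the class and method names are hypothetical and this is not the code from c1_LIRGenerator_sparc.cpp.

```java
public class DisplacementOverflow {
    // Folds a constant index into the displacement using 32-bit arithmetic,
    // mirroring the shape `disp += index << shift`.
    static int foldInt(int disp, int index, int shift) {
        return disp + (index << shift);
    }

    // The same folding done in 64-bit arithmetic.
    static long foldLong(long disp, long index, int shift) {
        return disp + (index << shift);
    }

    // True if a 64-bit value survives a round trip through int.
    static boolean fitsInInt(long value) {
        return value == (int) value;
    }

    public static void main(String[] args) {
        int disp = 16;
        int index = 0x10000000; // 2^28 elements
        int shift = 3;          // 8-byte elements

        long wide = foldLong(disp, index, shift);
        System.out.println("32-bit result: " + foldInt(disp, index, shift));
        System.out.println("64-bit result: " + wide);
        System.out.println("fits in int:   " + fitsInInt(wide));
    }
}
```

With these inputs the 32-bit result wraps to a negative displacement while the 64-bit result stays positive, which is why a check such as fitsInInt (or a wider displacement type) is needed before folding the constant.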

From dean.long at oracle.com  Wed Sep 14 21:26:14 2016
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Wed, 14 Sep 2016 14:26:14 -0700
Subject: Possible integer overflow in LIRGenerator::generate_address on
	SPARC and other platforms
In-Reply-To: <57D992D0.4010009@oracle.com>
References: <1c3f2f5368754962a6e2b2684c6e1fa2@DEWDFE13DE14.global.corp.sap>
	<57D992D0.4010009@oracle.com>
Message-ID: <d2b80789-6c33-3dcb-28e4-361d3ded9f92@oracle.com>

For sparc, I think this is used by LIRGenerator::do_LoadField() and 
do_StoreField(), so "disp" should be limited by the size of object.  I 
didn't find sparc using it for Unsafe put.

dl


On 9/14/16 11:11 AM, Vladimir Kozlov wrote:
> CC to group since I am not familiar with C1.
>
> On SPARC generate_address() is called only from
>
>   LIR_Address* generate_address(LIR_Opr base, int disp, BasicType type) {
>     return generate_address(base, LIR_OprFact::illegalOpr, 0, disp, 
> type);
>   }
>
> So it is not a issue. But I agree with you in general.
>
> On x86 LIRGenerator::emit_array_address() may have this problem.
>
> The only explanation I see that we did not hit it is Interpreter may 
> be more careful about checking it and throw exception.
>
> It could be C1 check this values somewhere else.
>
> Thanks,
> Vladimir
>
> On 9/6/16 9:21 AM, Doerr, Martin wrote:
>> Hi Vladimir,
>>
>> I was wondering about the following code in 
>> LIRGenerator::generate_address in c1_LIRGenerator_sparc.cpp (and some 
>> other platforms):
>>
>> if (index->is_constant()) {
>>
>>      disp += index->as_constant_ptr()->as_jint() << shift;
>>
>> It's fine to compute the constant in general, but disp is an int!
>>
>> Seems like the only user of this function which uses an index is 
>> Unsafe put/get where nobody has noticed it yet.
>>
>> Do you think we have to fix this in 9?
>>
>> I can open a bug if you like.
>>
>> Best regards,
>>
>> Martin
>>


From david.d.leopoldseder at oracle.com  Thu Sep 15 14:10:05 2016
From: david.d.leopoldseder at oracle.com (David Leopoldseder)
Date: Thu, 15 Sep 2016 16:10:05 +0200
Subject: RFR: 8166125: [JVMCI] Missing JVMCI flag default values
Message-ID: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>

Hi,

Please review this patch.

Bug:
     During the initial commit for the JVMCI JEP some options JVMCI sets 
differently than c2 have been forgotten.
Fix:
     Set the options if INCLUDE_JVMCI is true and -XX:+UseJVMCICompiler.

http://cr.openjdk.java.net/~davleopo/JDK-8166125/webrev.001/
https://bugs.openjdk.java.net/browse/JDK-8166125

- david

From martin.doerr at sap.com  Thu Sep 15 15:13:04 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 15 Sep 2016 15:13:04 +0000
Subject: Possible integer overflow in LIRGenerator::generate_address on
	SPARC and other platforms
In-Reply-To: <57D992D0.4010009@oracle.com>
References: <1c3f2f5368754962a6e2b2684c6e1fa2@DEWDFE13DE14.global.corp.sap>
	<57D992D0.4010009@oracle.com>
Message-ID: <a8b5be48b7414ed4ad627cb964e8ab89@DEWDFE13DE14.global.corp.sap>

Hi Vladimir,

thanks for taking a look. I'll provide a webrev and send an RFR.

Best regards,
Martin

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Mittwoch, 14. September 2016 20:11
To: Doerr, Martin <martin.doerr at sap.com>
Cc: hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: Possible integer overflow in LIRGenerator::generate_address on SPARC and other platforms

CC to group since I am not familiar with C1.

On SPARC generate_address() is called only from

   LIR_Address* generate_address(LIR_Opr base, int disp, BasicType type) {
     return generate_address(base, LIR_OprFact::illegalOpr, 0, disp, type);
   }

So it is not a issue. But I agree with you in general.

On x86 LIRGenerator::emit_array_address() may have this problem.

The only explanation I see that we did not hit it is Interpreter may be more careful about checking it and throw exception.

It could be C1 check this values somewhere else.

Thanks,
Vladimir

On 9/6/16 9:21 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> I was wondering about the following code in LIRGenerator::generate_address in c1_LIRGenerator_sparc.cpp (and some other platforms):
>
> if (index->is_constant()) {
>
>      disp += index->as_constant_ptr()->as_jint() << shift;
>
> It's fine to compute the constant in general, but disp is an int!
>
> Seems like the only user of this function which uses an index is Unsafe put/get where nobody has noticed it yet.
>
> Do you think we have to fix this in 9?
>
> I can open a bug if you like.
>
> Best regards,
>
> Martin
>

From martin.doerr at sap.com  Thu Sep 15 15:25:05 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 15 Sep 2016 15:25:05 +0000
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
Message-ID: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>

Hi,

as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However, int may overflow on 64 bit platforms.

Please review the following webrev:
http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/

I'll also need a sponsor, please.

Thanks and best regards,
Martin


From dmitrij.pochepko at oracle.com  Thu Sep 15 16:57:58 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 15 Sep 2016 19:57:58 +0300
Subject: RFR: 8166146 - [Testbug] update codecache tests with minimal vm filter
Message-ID: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>

Hi,

please review this small fix for 8166146 - [Testbug] update codecache tests 
with minimal vm filter

This patch updates the codecache tests which are not applicable to the 
minimal vm with the respective requires expression


webrev: http://cr.openjdk.java.net/~dpochepk/8166146/webrev.01/

CR: https://bugs.openjdk.java.net/browse/JDK-8166146

I've tested this change on linux-x86 using minimal vm


Thanks,

Dmitrij


From igor.ignatyev at oracle.com  Thu Sep 15 17:00:59 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 15 Sep 2016 20:00:59 +0300
Subject: RFR: 8166146 - [Testbug] update codecache tests with minimal vm
	filter
In-Reply-To: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>
References: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>
Message-ID: <6B559DCA-5732-456C-9516-197B0252B452@oracle.com>

Dmitrij,

could you please explain why those tests are not applicable for minimal vm?

Thanks,
-- Igor

> On Sep 15, 2016, at 7:57 PM, Dmitrij Pochepko <dmitrij.pochepko at oracle.com> wrote:
> 
> Hi,
> 
> please  review small fix for 8166146 - [Testbug] update codecache tests with minimal vm filter
> 
> This patch update codecache tests which are not applicable for minimal vm with respective requires expression
> 
> 
> webrev: http://cr.openjdk.java.net/~dpochepk/8166146/webrev.01/
> 
> CR: https://bugs.openjdk.java.net/browse/JDK-8166146
> 
> I've tested this change on linux-x86 using minimal vm
> 
> 
> Thanks,
> 
> Dmitrij
> 


From dmitrij.pochepko at oracle.com  Thu Sep 15 17:12:03 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Thu, 15 Sep 2016 20:12:03 +0300
Subject: RFR: 8166146 - [Testbug] update codecache tests with minimal vm
	filter
In-Reply-To: <6B559DCA-5732-456C-9516-197B0252B452@oracle.com>
References: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>
	<6B559DCA-5732-456C-9516-197B0252B452@oracle.com>
Message-ID: <e25c7337-a4c9-635a-dedc-3509e77abb2f@oracle.com>

Hi,

these tests use the java.management module, which is incompatible with 
the minimal vm (even if this module is present in the tested image).

Thanks,
Dmitrij
> Dmitrij,
>
> could you please explain why those tests are applicable for minimal vm?
>
> Thanks,
> -- Igor
>
>> On Sep 15, 2016, at 7:57 PM, Dmitrij Pochepko <dmitrij.pochepko at oracle.com> wrote:
>>
>> Hi,
>>
>> please  review small fix for 8166146 - [Testbug] update codecache tests with minimal vm filter
>>
>> This patch update codecache tests which are not applicable for minimal vm with respective requires expression
>>
>>
>> webrev: http://cr.openjdk.java.net/~dpochepk/8166146/webrev.01/
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8166146
>>
>> I've tested this change on linux-x86 using minimal vm
>>
>>
>> Thanks,
>>
>> Dmitrij
>>


From igor.ignatyev at oracle.com  Thu Sep 15 17:18:58 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 15 Sep 2016 20:18:58 +0300
Subject: RFR: 8166146 - [Testbug] update codecache tests with minimal vm
	filter
In-Reply-To: <e25c7337-a4c9-635a-dedc-3509e77abb2f@oracle.com>
References: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>
	<6B559DCA-5732-456C-9516-197B0252B452@oracle.com>
	<e25c7337-a4c9-635a-dedc-3509e77abb2f@oracle.com>
Message-ID: <C6F7B60D-9E80-423A-AE57-F8F3AD4F24F6@oracle.com>

The tests have '@modules java.management', so it's jtreg's responsibility to filter them out if this module is incompatible/unavailable. I don't think we should clutter up tests w/ unneeded directives; otherwise you will have to go and update all tests which use modules from compact3+, and you will have to update them again when someone changes which modules the minimal vm supports. In other words, this should be solved on another level: the test execution system, the test harness, wherever, but not in the tests. They have already declared everything they depend on.

Thanks,
-- Igor

> On Sep 15, 2016, at 8:12 PM, Dmitrij Pochepko <dmitrij.pochepko at oracle.com> wrote:
> 
> Hi,
> 
> these tests are using java.management module, which is incompatible with minimal vm(even if this modules is present in tested image).
> 
> Thanks,
> Dmitrij
>> Dmitrij,
>> 
>> could you please explain why those tests are applicable for minimal vm?
>> 
>> Thanks,
>> -- Igor
>> 
>>> On Sep 15, 2016, at 7:57 PM, Dmitrij Pochepko <dmitrij.pochepko at oracle.com> wrote:
>>> 
>>> Hi,
>>> 
>>> please  review small fix for 8166146 - [Testbug] update codecache tests with minimal vm filter
>>> 
>>> This patch update codecache tests which are not applicable for minimal vm with respective requires expression
>>> 
>>> 
>>> webrev: http://cr.openjdk.java.net/~dpochepk/8166146/webrev.01/
>>> 
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8166146
>>> 
>>> I've tested this change on linux-x86 using minimal vm
>>> 
>>> 
>>> Thanks,
>>> 
>>> Dmitrij
>>> 
> 
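
To make the two positions concrete, here is a sketch of what a test header with both directives might look like. It is not taken from the webrev: the test class CodeCachePoolsSketch is hypothetical, and the `vm.flavor != "minimal"` expression is an assumption about how the proposed filter could be written. The `@modules` line is the declaration Igor refers to; the `@requires` line is the kind of extra directive he argues should be unnecessary.

```java
/*
 * @test
 * @summary Hypothetical test that lists non-heap memory pools (code cache etc.)
 * @modules java.management
 * @requires vm.flavor != "minimal"
 * @run main CodeCachePoolsSketch
 */
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

public class CodeCachePoolsSketch {
    // Collects the names of all non-heap memory pools; on HotSpot these
    // include the code cache (or its segments).
    static List<String> nonHeapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = nonHeapPoolNames();
        if (names.isEmpty()) {
            throw new RuntimeException("no non-heap memory pools reported");
        }
        names.forEach(System.out::println);
    }
}
```

Outside jtreg the header is just a comment, so the class can also be compiled and run directly with java.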


From vladimir.kozlov at oracle.com  Thu Sep 15 17:49:35 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 15 Sep 2016 10:49:35 -0700
Subject: RFR: 8166146 - [Testbug] update codecache tests with minimal vm
	filter
In-Reply-To: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>
References: <d7820df3-2317-55cf-8b3c-8c017a3fe479@oracle.com>
Message-ID: <b7a05cd8-0556-7b2a-2007-a26dba12337a@oracle.com>

Good.

thanks,
Vladimir

On 9/15/16 9:57 AM, Dmitrij Pochepko wrote:
> Hi,
>
> please  review small fix for 8166146 - [Testbug] update codecache tests with minimal vm filter
>
> This patch update codecache tests which are not applicable for minimal vm with respective requires expression
>
>
> webrev: http://cr.openjdk.java.net/~dpochepk/8166146/webrev.01/
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8166146
>
> I've tested this change on linux-x86 using minimal vm
>
>
> Thanks,
>
> Dmitrij
>

From vladimir.kozlov at oracle.com  Thu Sep 15 17:58:59 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 15 Sep 2016 10:58:59 -0700
Subject: RFR: 8166125: [JVMCI] Missing JVMCI flag default values
In-Reply-To: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
References: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
Message-ID: <6ef5a5b0-a1a1-421d-4b7e-f104819d0c61@oracle.com>

Looks good.

thanks,
Vladimir

On 9/15/16 7:10 AM, David Leopoldseder wrote:
> Hi,
>
> Please review this patch.
>
> Bug:
>     During the initial commit for the JVMCI JEP some options JVMCI sets differently than c2 have been forgotten.
> Fix:
>     Set the options if INCLUDE_JVMCI is true and -XX:+UseJVMCICompiler.
>
> http://cr.openjdk.java.net/~davleopo/JDK-8166125/webrev.001/
> https://bugs.openjdk.java.net/browse/JDK-8166125
>
> - david

From vladimir.kozlov at oracle.com  Thu Sep 15 18:05:42 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 15 Sep 2016 11:05:42 -0700
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
In-Reply-To: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
References: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
Message-ID: <66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>

Good but it is not enough.

emit_array_address() in c1_LIRGenerator_x86.cpp has the same problem.
I would suggest looking at all places where the following methods are called and making sure they are correct:

LIR_Address(LIR_Opr base, intx disp, BasicType type)
LIR_Address(LIR_Opr base, LIR_Opr index, Scale scale, intx disp, BasicType type)

Thanks,
Vladimir

On 9/15/16 8:25 AM, Doerr, Martin wrote:
> Hi,
>
>
>
> as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However,
> int may overflow on 64 bit platforms.
>
>
>
> Please review the following webrev:
>
> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/
>
>
>
> I'll also need a sponsor, please.
>
>
>
> Thanks and best regards,
>
> Martin
>
>
>

From doug.simon at oracle.com  Thu Sep 15 18:31:00 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 15 Sep 2016 20:31:00 +0200
Subject: RFR: 8166125: [JVMCI] Missing JVMCI flag default values
In-Reply-To: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
References: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
Message-ID: <FB0D9DB7-0A47-48DF-A483-5117F8A3BC6C@oracle.com>

I would consider moving the logic from Arguments::set_jvmci_specific_flags (in arguments.[ch]pp) to JVMCIGlobals::set_jvmci_specific_flags (in jvmci_globals.[ch]pp) to match the approach taken for check_jvmci_flags_are_consistent.


> On 15 Sep 2016, at 16:10, David Leopoldseder <david.d.leopoldseder at oracle.com> wrote:
> 
> Hi,
> 
> Please review this patch.
> 
> Bug: 
>     During the initial commit for the JVMCI JEP some options JVMCI sets differently than c2 have been forgotten.
> Fix:
>     Set the options if INCLUDE_JVMCI is true and -XX:+UseJVMCICompiler.
> 
> http://cr.openjdk.java.net/~davleopo/JDK-8166125/webrev.001/
> https://bugs.openjdk.java.net/browse/JDK-8166125
> 
> - david


From cthalinger at twitter.com  Thu Sep 15 18:34:45 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Thu, 15 Sep 2016 08:34:45 -1000
Subject: RFR: 8166125: [JVMCI] Missing JVMCI flag default values
In-Reply-To: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
References: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
Message-ID: <03D19E4E-8D70-4D43-A8EC-C4F5BD9505DF@twitter.com>

Nobody noticed until now?

> On Sep 15, 2016, at 4:10 AM, David Leopoldseder <david.d.leopoldseder at oracle.com> wrote:
> 
> Hi,
> 
> Please review this patch.
> 
> Bug: 
>     During the initial commit for the JVMCI JEP some options JVMCI sets differently than c2 have been forgotten.
> Fix:
>     Set the options if INCLUDE_JVMCI is true and -XX:+UseJVMCICompiler.
> 
> http://cr.openjdk.java.net/~davleopo/JDK-8166125/webrev.001/
> https://bugs.openjdk.java.net/browse/JDK-8166125
> 
> - david


From doug.simon at oracle.com  Thu Sep 15 18:38:18 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 15 Sep 2016 20:38:18 +0200
Subject: RFR: 8166125: [JVMCI] Missing JVMCI flag default values
In-Reply-To: <03D19E4E-8D70-4D43-A8EC-C4F5BD9505DF@twitter.com>
References: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
	<03D19E4E-8D70-4D43-A8EC-C4F5BD9505DF@twitter.com>
Message-ID: <139FEE17-8C56-4E3F-A289-670D34A061D1@oracle.com>

David noticed this while investigating some performance regressions that occurred around the time we switched from a separate JVMCI VM binary (i.e. COMPILERJVMCI) to -XX:+UseJVMCICompiler.

-Doug

> On 15 Sep 2016, at 20:34, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
> Nobody noticed until now?
> 
>> On Sep 15, 2016, at 4:10 AM, David Leopoldseder <david.d.leopoldseder at oracle.com> wrote:
>> 
>> Hi,
>> 
>> Please review this patch.
>> 
>> Bug: 
>>     The initial commit for the JVMCI JEP omitted some options that JVMCI sets differently from C2.
>> Fix:
>>     Set the options if INCLUDE_JVMCI is true and -XX:+UseJVMCICompiler is set.
>> 
>> http://cr.openjdk.java.net/~davleopo/JDK-8166125/webrev.001/
>> https://bugs.openjdk.java.net/browse/JDK-8166125
>> 
>> - david
> 


From cthalinger at twitter.com  Thu Sep 15 18:40:35 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Thu, 15 Sep 2016 08:40:35 -1000
Subject: RFR: 8166125: [JVMCI] Missing JVMCI flag default values
In-Reply-To: <139FEE17-8C56-4E3F-A289-670D34A061D1@oracle.com>
References: <57a99e49-68ad-8e21-0736-8fda150bb7f4@oracle.com>
	<03D19E4E-8D70-4D43-A8EC-C4F5BD9505DF@twitter.com>
	<139FEE17-8C56-4E3F-A289-670D34A061D1@oracle.com>
Message-ID: <49305203-6D6F-4B5D-B5E6-9530A16E669F@twitter.com>

Crazy :-)  Well, better late than never.  Looks good.

> On Sep 15, 2016, at 8:38 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> David noticed this while investigating some performance regressions that occurred around the time we switched from a separate JVMCI VM binary (i.e. COMPILERJVMCI) to -XX:+UseJVMCICompiler.
> 
> -Doug
> 
>> On 15 Sep 2016, at 20:34, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>> Nobody noticed until now?
>> 
>>> On Sep 15, 2016, at 4:10 AM, David Leopoldseder <david.d.leopoldseder at oracle.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Please review this patch.
>>> 
>>> Bug: 
>>>    The initial commit for the JVMCI JEP omitted some options that JVMCI sets differently from C2.
>>> Fix:
>>>    Set the options if INCLUDE_JVMCI is true and -XX:+UseJVMCICompiler is set.
>>> 
>>> http://cr.openjdk.java.net/~davleopo/JDK-8166125/webrev.001/
>>> https://bugs.openjdk.java.net/browse/JDK-8166125
>>> 
>>> - david
>> 
> 


From vladimir.kozlov at oracle.com  Fri Sep 16 17:56:20 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 16 Sep 2016 10:56:20 -0700
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
	<05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
	<7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
Message-ID: <57DC3244.7050300@oracle.com>

Add assert (we have it in other places too):

       assert(declared_signature != NULL, "cannot be null");

For functional correctness, please ask Vladimir Ivanov to take a look.

Thanks,
Vladimir

On 9/11/16 4:51 AM, Jamsheed C m wrote:
> I made some changes to my fix. The webrev is updated in place.
>
> PIT results with the latest modification are updated in the bug (not yet complete).
>
> Best Regards,
>
> Jamsheed
>
>
> On 9/10/2016 3:53 AM, Jamsheed C m wrote:
>>
>> Adding a little more description, as per my understanding.
>>
>> This issue can happen only in the case where compiled lforms are not inlined.
>>
>> There are two scenarios:
>> 1) no compiled lforms are inlined
>> 2) some compiled lforms are inlined, or the final method is not inlined (linkTo* not inlined), i.e. partially inlined
>>
>> In all these cases the *Invoke instruction* will be the *return Value* and will have an erased type.
>> So we reify the return type either by type casting (for the partially inlined case) or by pulling it directly from the callsite MT.
>>
>> Best Regards,
>>
>> Jamsheed
>>
>>
>> On 9/8/2016 3:26 PM, Jamsheed C m wrote:
>>> Hi All,
>>>
>>> bugid: https://bugs.openjdk.java.net/browse/JDK-8134389
>>>
>>> webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/
>>>
>>> Return type information is not available in lforms; this causes contradictions in operations such as store indexed, MH linkTo* site argument type casting, etc.
>>>
>>> fix: TypeCast to declared return type at lform return.
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>
>

From vladimir.kozlov at oracle.com  Fri Sep 16 18:04:19 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 16 Sep 2016 11:04:19 -0700
Subject: RFR: 8155219 - [TESTBUG] Rewrite compiler/ciReplay/TestVM.sh in
	java
In-Reply-To: <a764d97b-7015-d87c-a7c8-3031e1f2762e@oracle.com>
References: <a764d97b-7015-d87c-a7c8-3031e1f2762e@oracle.com>
Message-ID: <57DC3423.4020707@oracle.com>

Thank you for doing this. Looks good. I assume vm.debug is true for both builds: fastdebug and slowdebug.

Thanks,
Vladimir

On 9/8/16 7:48 AM, Dmitrij Pochepko wrote:
> Hi,
>
> please review fix for 8155219 - [TESTBUG] Rewrite compiler/ciReplay/TestVM.sh in java
>
> compiler/ciReplay/* tests were ported from shell to java.
>
>
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8155219
>
> webrev for root level: http://cr.openjdk.java.net/~dpochepk/8155219/webrev.root.01/
>
> webrev for hotspot: http://cr.openjdk.java.net/~dpochepk/8155219/webrev.01/
>
>
> I've tested it via rbt.
>
> Thanks,
>
> Dmitrij
>

From dmitrij.pochepko at oracle.com  Fri Sep 16 18:44:22 2016
From: dmitrij.pochepko at oracle.com (Dmitrij Pochepko)
Date: Fri, 16 Sep 2016 21:44:22 +0300
Subject: RFR: 8155219 - [TESTBUG] Rewrite compiler/ciReplay/TestVM.sh in
	java
In-Reply-To: <a764d97b-7015-d87c-a7c8-3031e1f2762e@oracle.com>
References: <a764d97b-7015-d87c-a7c8-3031e1f2762e@oracle.com>
Message-ID: <05897ff9-f355-972a-7755-0edb03d3ba4e@oracle.com>

Hi,

can somebody take a look?

Thanks,

Dmitrij

On 08.09.2016 17:48, Dmitrij Pochepko wrote:
> Hi,
>
> please review fix for 8155219 - [TESTBUG] Rewrite 
> compiler/ciReplay/TestVM.sh in java
>
> compiler/ciReplay/* tests were ported from shell to java.
>
>
>
> CR: https://bugs.openjdk.java.net/browse/JDK-8155219
>
> webrev for root level: 
> http://cr.openjdk.java.net/~dpochepk/8155219/webrev.root.01/
>
> webrev for hotspot: 
> http://cr.openjdk.java.net/~dpochepk/8155219/webrev.01/
>
>
> I've tested it via rbt.
>
> Thanks,
>
> Dmitrij
>


From vladimir.kozlov at oracle.com  Sat Sep 17 01:17:56 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 16 Sep 2016 18:17:56 -0700
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
Message-ID: <57DC99C4.4030701@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8166096

+++ b/make/lib/JvmOverrideFiles.gmk	Fri Sep 16 18:14:06 2016 -0700
@@ -31,6 +31,7 @@

  ifeq ($(TOOLCHAIN_TYPE), gcc)
    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := -fno-var-tracking-assignments -O0
+  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS := -fno-var-tracking-assignments
  endif

  ifeq ($(OPENJDK_TARGET_OS), linux)


Remove annoying Hotspot compilation warning:

hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member function 'static objArrayHandle CompilerToVM::initialize_intrinsics(Thread*)':
hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: variable tracking size limit exceeded with -fvar-tracking-assignments, retrying without
  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {

Thanks,
Vladimir

From dean.long at oracle.com  Sat Sep 17 03:28:39 2016
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Fri, 16 Sep 2016 20:28:39 -0700
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
In-Reply-To: <57DC99C4.4030701@oracle.com>
References: <57DC99C4.4030701@oracle.com>
Message-ID: <9721ae67-e2f8-f5fb-0078-5c45b90c3040@oracle.com>

Good.

dl


On 9/16/16 6:17 PM, Vladimir Kozlov wrote:
> https://bugs.openjdk.java.net/browse/JDK-8166096
>
> +++ b/make/lib/JvmOverrideFiles.gmk    Fri Sep 16 18:14:06 2016 -0700
> @@ -31,6 +31,7 @@
>
>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := 
> -fno-var-tracking-assignments -O0
> +  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS := 
> -fno-var-tracking-assignments
>  endif
>
>  ifeq ($(OPENJDK_TARGET_OS), linux)
>
>
> Remove annoying Hotspot compilation warning:
>
> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member 
> function 'static objArrayHandle 
> CompilerToVM::initialize_intrinsics(Thread*)':
> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: 
> variable tracking size limit exceeded with -fvar-tracking-assignments, 
> retrying without
>  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {
>
> Thanks,
> Vladimir


From vladimir.kozlov at oracle.com  Sat Sep 17 04:09:00 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 16 Sep 2016 21:09:00 -0700
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
In-Reply-To: <9721ae67-e2f8-f5fb-0078-5c45b90c3040@oracle.com>
References: <57DC99C4.4030701@oracle.com>
	<9721ae67-e2f8-f5fb-0078-5c45b90c3040@oracle.com>
Message-ID: <57DCC1DC.9000007@oracle.com>

Thank you, Dean

Vladimir

On 9/16/16 8:28 PM, dean.long at oracle.com wrote:
> Good.
>
> dl
>
>
> On 9/16/16 6:17 PM, Vladimir Kozlov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8166096
>>
>> +++ b/make/lib/JvmOverrideFiles.gmk    Fri Sep 16 18:14:06 2016 -0700
>> @@ -31,6 +31,7 @@
>>
>>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>>    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := -fno-var-tracking-assignments -O0
>> +  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS := -fno-var-tracking-assignments
>>  endif
>>
>>  ifeq ($(OPENJDK_TARGET_OS), linux)
>>
>>
>> Remove annoying Hotspot compilation warning:
>>
>> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member function 'static objArrayHandle CompilerToVM::initialize_intrinsics(Thread*)':
>> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: variable tracking size limit exceeded with -fvar-tracking-assignments, retrying without
>>  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {
>>
>> Thanks,
>> Vladimir
>

From HORII at jp.ibm.com  Sun Sep 18 17:36:27 2016
From: HORII at jp.ibm.com (Hiroshi H Horii)
Date: Sun, 18 Sep 2016 17:36:27 +0000
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
In-Reply-To: <fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
References: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
	<fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
Message-ID: <OF069946E4.2F8AAC40-ON49258032.006041F6-49258032.0060B886@notes.na.collabserv.com>

Hi Martin, and all
(Please allow me to send this mail twice. The first mail is still pending 
because it exceeded 100KB.)

Thank you for your review. Gustavo and I created a new change based 
on your comments. I would like to request a review again.

My account on the cr server is not available now (because of my mistake...) 
and Gustavo cannot create a webrev file for another reason. I would like 
to attach a diff file created with "hg diff -g" in hotspot.  If possible, 
could someone create a webrev file with this changeset?


Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo

"Doerr, Martin" <martin.doerr at sap.com> wrote on 09/13/2016 18:35:09:

> From: "Doerr, Martin" <martin.doerr at sap.com>
> To: Hiroshi H Horii/Japan/IBM at IBMJP,
> "hotspot-compiler-dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
> Cc: "Volker Simonis (volker.simonis at gmail.com)" 
> <volker.simonis at gmail.com>, Gustavo Bueno Romero <gromero at br.ibm.com>
> Date: 09/13/2016 18:36
> Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> Hi Hiroshi,
> 
> we appreciate your change. Thanks for contributing it.
> It basically looks good, but I'd like to propose some minor improvements.
> 
> 
> kernel_crc32_1word_vpmsumd:
> 
> 1. The Pre-align code can be implemented shorter:
>     clrldi_(prealign, buf, 57);
>     beq(CCR0, L_alignHead);
> 
>     subfic(prealign, prealign, 128);
> 
> 2. I'd prefer the label name "L_alignedHead".
> 
> 3. The branch b(L_alignTail) and the label are not needed and should
> get removed.
> 
> 
> kernel_crc32_1word_aligned:
> 
> 1. When saving and restoring non-volatile vector registers, please 
> use offset differences of -16 instead of -32.
> (The ABI allows up to 288 bytes to be used in frameless functions so
> it will fit if -16 is used.)
> 
> 2. The std instructions would be better used with int offsets so 
> you can get rid of the addi(offset, offset, -8) instructions.
> 
> 
> Comments:
> For single line comments "//" should be used instead of "/*". Would 
> be nice if you could change them.
> 
> 
> Thanks and best regards,
> Martin
> 
> 
> From: Hiroshi H Horii [mailto:HORII at jp.ibm.com] 
> Sent: Dienstag, 6. September 2016 16:50
> To: hotspot-compiler-dev at openjdk.java.net; vladimir.kozlov at oracle.com
> Cc: Volker Simonis (volker.simonis at gmail.com) 
> <volker.simonis at gmail.com>; Doerr, Martin <martin.doerr at sap.com>; 
> Gustavo Bueno Romero <gromero at br.ibm.com>
> Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> Dear Vladimir and all:
> 
> Can I please request reviews for the following change?
> 
> JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
> webrev: http://cr.openjdk.java.net/~gromero/8164920/01/
> 
> As Volker commented in the above JIRA, this is a ppc64-only 
> improvement which will not 
> affect any of the Oracle platforms in any way. 
> 
> This change includes a new implementation of the CRC32 intrinsics for ppc64le.
> In my local experiment, CRC32 of 64KB was calculated more than 20 
> times faster than with the original. 
> Performance of the CRC32 intrinsic is important for running recent Apache Cassandra.
> A Cassandra daemon needs to read 64KB data from a disk with CRC32 
> checksum by default.
> 
> This JIRA entry has "jdk9-fc-request" label.
> If there is a chance to include new change in JDK 9 for ppc64le, I 
> would like to request 
> a review for this change.
> 
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo

-------------- next part --------------
A non-text attachment was scrubbed...
Name: hotspot.crc32.20160918.changeset
Type: application/octet-stream
Size: 60026 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160918/179889f0/hotspot.crc32.20160918-0001.changeset>

From david.holmes at oracle.com  Mon Sep 19 05:01:47 2016
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 19 Sep 2016 15:01:47 +1000
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
In-Reply-To: <57DC99C4.4030701@oracle.com>
References: <57DC99C4.4030701@oracle.com>
Message-ID: <5157ce5a-2b55-22e8-dd88-68bbfc237dae@oracle.com>

On 17/09/2016 11:17 AM, Vladimir Kozlov wrote:
> https://bugs.openjdk.java.net/browse/JDK-8166096
>
> +++ b/make/lib/JvmOverrideFiles.gmk    Fri Sep 16 18:14:06 2016 -0700
> @@ -31,6 +31,7 @@
>
>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := -fno-var-tracking-assignments -O0
> +  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS :=
> -fno-var-tracking-assignments
>  endif
>
>  ifeq ($(OPENJDK_TARGET_OS), linux)
>
>
> Remove annoying Hotspot compilation warning:

Seems reasonable as a short term silencer, but ... does it imply the 
code needs to be changed somehow?

Thanks,
David

> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member
> function 'static objArrayHandle
> CompilerToVM::initialize_intrinsics(Thread*)':
> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: variable
> tracking size limit exceeded with -fvar-tracking-assignments, retrying
> without
>  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {
>
> Thanks,
> Vladimir

From igor.ignatyev at oracle.com  Mon Sep 19 09:38:00 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 19 Sep 2016 12:38:00 +0300
Subject: RFR(XS) : 8166164 :
	compiler/compilercontrol/share/processors/LogProcessor.java
	does not close Scanner
Message-ID: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
> 16 lines changed: 2 ins; 0 del; 14 mod;

Hi all,

could you please review this small patch which fixes a resource leak in compiler/compilercontrol tests? 
LogProcessor::getScanner creates a new Scanner, but there is no code which closes it. This leak leads to a 'failed to clean up files after test' error from jtreg.

the fix was tested by running :hotspot_compiler test group.
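
For readers who do not open the webrev, the usual way to avoid this kind of leak is try-with-resources. The following is only a minimal sketch of that idiom, not the actual patch: the class name, the readLines helper and the temp file are hypothetical and do not come from LogProcessor.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerCloseSketch {
    // Reads all lines of a log file. The Scanner, and with it the underlying
    // file handle, is closed when the try block exits, even on an exception.
    static List<String> readLines(Path log) throws IOException {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(log)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a compilation log written by a test.
        Path log = Files.createTempFile("hotspot", ".log");
        Files.write(log, List.of("first line", "second line"));
        System.out.println(readLines(log));
        // An open handle can prevent this delete on some platforms,
        // which is the kind of cleanup failure jtreg reports.
        Files.delete(log);
    }
}
```

Without the try block the Scanner would stay open until it is garbage collected, so the file may still be in use when the harness tries to remove it.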

JBS: https://bugs.openjdk.java.net/browse/JDK-8166164
webrev: http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/

Thanks,
-- Igor


From tobias.hartmann at oracle.com  Mon Sep 19 12:15:04 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 19 Sep 2016 14:15:04 +0200
Subject: [9] RFR(S): 8166046: [TESTBUG]
	compiler/stringopts/TestStringObjectInitialization.java fails with OOME
Message-ID: <57DFD6C8.2080508@oracle.com>

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8166046
http://cr.openjdk.java.net/~thartmann/8166046/webrev.00/

The test creates 101 threads that each execute a loop with 10,000 iterations, each iteration appending a String of size 17 to another String. This results in a String of size 101 * 10,000 * 17 = 17,170,000 (= 35 MB). In the failing cases, the test is executed on 32-bit Windows with -Xcomp and -XX:+DeoptimizeALot, which increase memory consumption of the VM due to extensive (re-)compilation, deoptimization and re-allocation. The test fails because there is not enough heap space to hold the String.

I reduced the number of threads to 32+1 and the number of per-thread iterations to 1000. I verified that this fixes the OOMEs on the failing 32-bit Windows machines.
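
The size arithmetic above can be checked with a short sketch. It assumes that the appended String still has length 17 after the change and that a char takes 2 bytes, which is what the 35 MB figure suggests; the class and method names are made up for illustration.

```java
public class StringSizeSketch {
    // Total number of chars appended by all threads together.
    static long totalChars(int threads, int iterations, int appendedLength) {
        return (long) threads * iterations * appendedLength;
    }

    public static void main(String[] args) {
        long before = totalChars(101, 10_000, 17); // original test parameters
        long after = totalChars(32 + 1, 1_000, 17); // reduced parameters
        // Assumption: 2 bytes per char.
        System.out.println(before + " chars, " + 2 * before + " bytes");
        System.out.println(after + " chars, " + 2 * after + " bytes");
    }
}
```

Under these assumptions the final String shrinks from 17,170,000 chars to 561,000 chars, roughly a factor of 30.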

Thanks,
Tobias


From pavel.punegov at oracle.com  Mon Sep 19 12:18:24 2016
From: pavel.punegov at oracle.com (Pavel Punegov)
Date: Mon, 19 Sep 2016 15:18:24 +0300
Subject: RFR(XS) : 8166164 :
	compiler/compilercontrol/share/processors/LogProcessor.java
	does not close Scanner
In-Reply-To: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>
References: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>
Message-ID: <CB865A93-337C-4377-85A8-DF91876FC084@oracle.com>

Hi Igor,

the change looks good. Thanks for fixing.

-- Pavel.

> On 19 Sep 2016, at 12:38, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
> 
> http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>> 16 lines changed: 2 ins; 0 del; 14 mod;
> 
> Hi all,
> 
> could you please review this small patch which fixes a resource leak in compiler/compilercontrol tests? 
> LogProcessor::getScanner creates a new Scanner, but there is no code which closes it. This leak leads to a 'failed to clean up files after test' error from jtreg.
> 
> the fix was tested by running :hotspot_compiler test group.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8166164
> webrev: http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
> 
> Thanks,
> -- Igor
> 


From martin.doerr at sap.com  Mon Sep 19 13:47:50 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 19 Sep 2016 13:47:50 +0000
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
In-Reply-To: <66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>
References: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
	<66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>
Message-ID: <73f98e3882bd46dab427a02de68a1b93@DEWDFE13DE14.global.corp.sap>

Hi Vladimir,

you're right. I have fixed that too in the new webrev:
http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.01/

The 2 LIR_Address constructors you have mentioned don't have many users. The other ones look ok.

Thanks and best regards,
Martin

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Donnerstag, 15. September 2016 20:06
To: hotspot-compiler-dev at openjdk.java.net
Cc: Doerr, Martin <martin.doerr at sap.com>
Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms

Good, but it is not enough.

emit_array_address() in c1_LIRGenerator_x86.cpp has the same problem.
I would suggest looking at all places where the following constructors are called and making sure they are correct:

LIR_Address(LIR_Opr base, intx disp, BasicType type)
LIR_Address(LIR_Opr base, LIR_Opr index, Scale scale, intx disp, BasicType type)
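
The overflow can be illustrated outside of HotSpot with a small sketch. This is not the C1 code: it only mimics folding a constant index into a displacement, and the index, shift and header offset values are hypothetical. The point is that the scaled index has to be computed in a 64-bit type (as intx is on 64-bit platforms) before it is added.

```java
public class DisplacementOverflowSketch {
    // Folds a constant index into a displacement using 32-bit arithmetic.
    static int dispInt(int index, int shift, int baseOffset) {
        return (index << shift) + baseOffset;
    }

    // Same computation, widened to 64 bits before shifting.
    static long dispLong(int index, int shift, int baseOffset) {
        return ((long) index << shift) + baseOffset;
    }

    public static void main(String[] args) {
        int index = 0x20000000; // hypothetical large constant index
        int shift = 3;          // 8-byte elements
        int baseOffset = 16;    // hypothetical array header size
        // The 32-bit result silently wraps; the 64-bit result does not.
        System.out.println(dispInt(index, shift, baseOffset));
        System.out.println(dispLong(index, shift, baseOffset));
    }
}
```

With these values the 32-bit version loses the high bit of the shifted index, so the two results differ.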

Thanks,
Vladimir

On 9/15/16 8:25 AM, Doerr, Martin wrote:
> Hi,
>
>
>
> as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However,
> int may overflow on 64 bit platforms.
>
>
>
> Please review the following webrev:
>
> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/
>
>
>
> I'll also need a sponsor, please.
>
>
>
> Thanks and best regards,
>
> Martin
>
>
>

From HORII at jp.ibm.com  Sun Sep 18 15:00:57 2016
From: HORII at jp.ibm.com (Hiroshi H Horii)
Date: Sun, 18 Sep 2016 15:00:57 +0000
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
In-Reply-To: <fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
References: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
	<fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
Message-ID: <OF88C38CD0.C57B6C11-ON49258032.004FBB7A-49258032.00527C19@notes.na.collabserv.com>

Hi Martin, and all

Thank you for your review. Gustavo and I created a new change based 
on your comments. I would like to request a review again.

My account on the cr server is not available now (because of my mistake...) 
and Gustavo cannot create a webrev file for another reason. I would like 
to attach a diff file created with "hg diff -g" in hotspot.  If possible, 
could someone create a webrev file with this changeset?

I also attach a test program for CRC32 Intrinsic.
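
For a quick sanity check without the attachment, a much smaller sketch along the following lines can be used. It is not the attached CRC32Test.java; the class name, the chunk sizes and the buffer contents are assumptions. It checks java.util.zip.CRC32 against the standard check value for "123456789" and verifies that a 64KB buffer gives the same checksum whether it is fed in one call or in pieces.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class Crc32Sketch {
    // Computes CRC32 over data, feeding it to the checksum in pieces of
    // at most `chunk` bytes (chunk must be positive).
    static long crc32(byte[] data, int chunk) {
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += chunk) {
            crc.update(data, off, Math.min(chunk, data.length - off));
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Standard CRC-32 check value for the ASCII string "123456789".
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(crc32(check, check.length) == 0xCBF43926L);

        // 64KB buffer with a simple byte pattern, whole versus chunked.
        byte[] buf = new byte[64 * 1024];
        for (int i = 0; i < buf.length; i++) {
            buf[i] = (byte) i;
        }
        System.out.println(crc32(buf, buf.length) == crc32(buf, 1000));
    }
}
```

The odd chunk size is deliberate: it makes later updates start at offsets that are not 128-byte aligned, which is the situation the pre-align code in the stub has to handle.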


Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo


From:   "Doerr, Martin" <martin.doerr at sap.com>
To:     Hiroshi H Horii/Japan/IBM at IBMJP, 
"hotspot-compiler-dev at openjdk.java.net" 
<hotspot-compiler-dev at openjdk.java.net>
Cc:     "Volker Simonis (volker.simonis at gmail.com)" 
<volker.simonis at gmail.com>, Gustavo Bueno Romero <gromero at br.ibm.com>
Date:   09/13/2016 18:36
Subject:        RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic


Hi Hiroshi,
 
we appreciate your change. Thanks for contributing it.
It basically looks good, but I'd like to propose some minor improvements.
 
 
kernel_crc32_1word_vpmsumd:
 
1. The Pre-align code can be implemented shorter:
    clrldi_(prealign, buf, 57);
    beq(CCR0, L_alignHead);
 
    subfic(prealign, prealign, 128);
 
2. I'd prefer the label name "L_alignedHead".
 
3. The branch b(L_alignTail) and the label are not needed and should get 
removed.
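
The arithmetic behind the shorter pre-align sequence in item 1 can be sketched as follows. This is a plain Java model, assuming that clrldi_ with 57 keeps the low 7 bits of the buffer address and sets CR0, and that subfic computes 128 minus its operand; the class and method names are hypothetical.

```java
public class PrealignSketch {
    // Number of bytes to process before buf reaches a 128-byte boundary.
    static long prealignBytes(long buf) {
        long low = buf & 127;            // clrldi_(prealign, buf, 57): keep low 7 bits
        return low == 0 ? 0 : 128 - low; // beq skips when aligned; subfic: 128 - low
    }

    public static void main(String[] args) {
        long[] addresses = {0x1000L, 0x1001L, 0x107FL};
        for (long buf : addresses) {
            long pre = prealignBytes(buf);
            System.out.println(Long.toHexString(buf) + " -> " + pre
                    + ", aligned: " + ((buf + pre) % 128 == 0));
        }
    }
}
```

An already aligned address takes the branch with a count of zero, so no separate compare is needed.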
 
 
kernel_crc32_1word_aligned:
 
1. When saving and restoring non-volatile vector registers, please use 
offset differences of -16 instead of -32.
(The ABI allows up to 288 bytes to be used in frameless functions so it 
will fit if -16 is used.)
 
2. The std instructions would be better used with int offsets so you can 
get rid of the addi(offset, offset, -8) instructions.
 
 
Comments:
For single line comments "//" should be used instead of "/*". Would be 
nice if you could change them.
 
 
Thanks and best regards,
Martin
 
 
From: Hiroshi H Horii [mailto:HORII at jp.ibm.com] 
Sent: Dienstag, 6. September 2016 16:50
To: hotspot-compiler-dev at openjdk.java.net; vladimir.kozlov at oracle.com
Cc: Volker Simonis (volker.simonis at gmail.com) <volker.simonis at gmail.com>; 
Doerr, Martin <martin.doerr at sap.com>; Gustavo Bueno Romero 
<gromero at br.ibm.com>
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
 
Dear Vladimir and all:

Can I please request reviews for the following change?

JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
webrev: http://cr.openjdk.java.net/~gromero/8164920/01/

As Volker commented in the above JIRA, this is a ppc64-only improvement 
which will not 
affect any of the Oracle platforms in any way. 

This change includes a new implementation of the CRC32 intrinsics for ppc64le.
In my local experiment, CRC32 of 64KB was calculated more than 20 times 
faster than with the original. 
Performance of the CRC32 intrinsic is important for running recent Apache 
Cassandra.
A Cassandra daemon needs to read 64KB data from a disk with CRC32 checksum 
by default.

This JIRA entry has "jdk9-fc-request" label.
If there is a chance to include new change in JDK 9 for ppc64le, I would 
like to request 
a review for this change.

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo


-------------- next part --------------
A non-text attachment was scrubbed...
Name: CRC32Test.java
Type: application/octet-stream
Size: 8738 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160918/c0595e0e/CRC32Test-0001.java>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hotspot.crc32.20160918.changeset
Type: application/octet-stream
Size: 60026 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160918/c0595e0e/hotspot.crc32.20160918-0001.changeset>

From erik.joelsson at oracle.com  Mon Sep 19 07:58:40 2016
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Mon, 19 Sep 2016 09:58:40 +0200
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
In-Reply-To: <57DC99C4.4030701@oracle.com>
References: <57DC99C4.4030701@oracle.com>
Message-ID: <daec6089-b9e5-cf04-2b2f-a2eea2d1b319@oracle.com>

Looks ok to me.

/Erik


On 2016-09-17 03:17, Vladimir Kozlov wrote:
> https://bugs.openjdk.java.net/browse/JDK-8166096
>
> +++ b/make/lib/JvmOverrideFiles.gmk    Fri Sep 16 18:14:06 2016 -0700
> @@ -31,6 +31,7 @@
>
>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := 
> -fno-var-tracking-assignments -O0
> +  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS := 
> -fno-var-tracking-assignments
>  endif
>
>  ifeq ($(OPENJDK_TARGET_OS), linux)
>
>
> Remove annoying Hotspot compilation warning:
>
> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member 
> function 'static objArrayHandle 
> CompilerToVM::initialize_intrinsics(Thread*)':
> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: 
> variable tracking size limit exceeded with -fvar-tracking-assignments, 
> retrying without
>  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {
>
> Thanks,
> Vladimir


From vladimir.x.ivanov at oracle.com  Mon Sep 19 16:38:54 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 19 Sep 2016 19:38:54 +0300
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
	<05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
	<7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
Message-ID: <3d9039ee-84a2-d4e8-e36a-08f12c4cd504@oracle.com>

Overall, the fix looks good.

Some nitpicks:
  (1) I'd prefer to avoid using ciMethod::is_compiled_lambda_form();
  (2) align with other uses of TypeCast for method handles.

Also, ciType::is_klass() can be replaced with 
!ciType::is_primitive_type() check, but IMO it doesn't matter much.

Something like the following:

diff --git a/src/share/vm/c1/c1_GraphBuilder.cpp b/src/share/vm/c1/c1_GraphBuilder.cpp
--- a/src/share/vm/c1/c1_GraphBuilder.cpp
+++ b/src/share/vm/c1/c1_GraphBuilder.cpp
@@ -1493,6 +1493,24 @@
    // Check to see whether we are inlining. If so, Return
    // instructions become Gotos to the continuation point.
    if (continuation() != NULL) {
+
+    int invoke_bci = state()->caller_state()->bci();
+
+    if (x != NULL && !ignore_return) {
+      ciMethod* caller = state()->scope()->caller()->method();
+      Bytecodes::Code invoke_raw_bc = caller->raw_code_at_bci(invoke_bci);
+      if (invoke_raw_bc == Bytecodes::_invokehandle ||
+          invoke_raw_bc == Bytecodes::_invokedynamic) {
+        ciType* declared_ret_type = caller->get_declared_signature_at_bci(invoke_bci)->return_type();
+        if (declared_ret_type->is_klass() &&
+            x->exact_type() == NULL &&
+            x->declared_type() != declared_ret_type &&
+            declared_ret_type != compilation()->env()->Object_klass()) {
+          x = append(new TypeCast(declared_ret_type->as_klass(), x, copy_state_before()));
+        }
+      }
+    }
+
      assert(!method()->is_synchronized() || InlineSynchronizedMethods, "can not inline synchronized methods yet");

      if (compilation()->env()->dtrace_method_probes()) {
@@ -1516,7 +1534,6 @@
      // State at end of inlined method is the state of the caller
      // without the method parameters on stack, including the
      // return value, if any, of the inlined method on operand stack.
-    int invoke_bci = state()->caller_state()->bci();
      set_state(state()->caller_state()->copy_for_parsing());
      if (x != NULL) {
        if (!ignore_return) {
diff --git a/src/share/vm/c1/c1_Instruction.cpp b/src/share/vm/c1/c1_Instruction.cpp
--- a/src/share/vm/c1/c1_Instruction.cpp
+++ b/src/share/vm/c1/c1_Instruction.cpp
@@ -360,7 +360,8 @@
  }

  ciType* Invoke::declared_type() const {
-  ciType *t = _target->signature()->return_type();
+  ciSignature* declared_signature = state()->scope()->method()->get_declared_signature_at_bci(state()->bci());
+  ciType *t = declared_signature->return_type();
    assert(t->basic_type() != T_VOID, "need return value of void method?");
    return t;
  }
diff --git a/src/share/vm/ci/ciMethod.hpp b/src/share/vm/ci/ciMethod.hpp
--- a/src/share/vm/ci/ciMethod.hpp
+++ b/src/share/vm/ci/ciMethod.hpp
@@ -255,6 +255,12 @@
      ciSignature* ignored_declared_signature;
      return get_method_at_bci(bci, ignored_will_link, &ignored_declared_signature);
    }
+  ciSignature*  get_declared_signature_at_bci(int bci) {
+    bool ignored_will_link;
+    ciSignature* declared_signature;
+    get_method_at_bci(bci, ignored_will_link, &declared_signature);
+    return declared_signature;
+  }

    // Given a certain calling environment, find the monomorphic target
    // for the call.  Return NULL if the call is not monomorphic in

Best regards,
Vladimir Ivanov

On 9/11/16 2:51 PM, Jamsheed C m wrote:
> I made some changes to my fix. The webrev is updated in place.
>
> PIT results with the latest modification are updated in the bug (not yet complete).
>
> Best Regards,
>
> Jamsheed
>
>
> On 9/10/2016 3:53 AM, Jamsheed C m wrote:
>>
>> Adding a little more description, as per my understanding.
>>
>> This issue can happen only in the case where compiled lforms are not inlined.
>>
>> There are two scenarios:
>> 1) no compiled lforms are inlined
>> 2) some compiled lforms are inlined, or the final method is not inlined
>> (linkTo* not inlined), i.e. partially inlined
>>
>> In all these cases the *Invoke instruction* will be the *return Value* and
>> will have an erased type.
>> So we reify the return type either by type casting (for the partially inlined
>> case) or by pulling it directly from the callsite MT.
>>
>> Best Regards,
>>
>> Jamsheed
>>
>>
>> On 9/8/2016 3:26 PM, Jamsheed C m wrote:
>>> Hi All,
>>>
>>> bugid: https://bugs.openjdk.java.net/browse/JDK-8134389
>>>
>>> webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/
>>>
>>> Return type information is not available in lforms; this causes
>>> contradictions in operations such as store indexed, MH linkTo* site
>>> argument type casting, etc.
>>>
>>> fix: TypeCast to declared return type at lform return.
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>
>

From vladimir.kozlov at oracle.com  Mon Sep 19 17:10:00 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 19 Sep 2016 10:10:00 -0700
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
In-Reply-To: <73f98e3882bd46dab427a02de68a1b93@DEWDFE13DE14.global.corp.sap>
References: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
	<66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>
	<73f98e3882bd46dab427a02de68a1b93@DEWDFE13DE14.global.corp.sap>
Message-ID: <a11870a3-1de9-2d5b-0498-39eb7c0357df@oracle.com>

This looks good.

Thanks,
Vladimir

On 9/19/16 6:47 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> you're right. I have fixed that too in the new webrev:
> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.01/
>
> The 2 LIR_Address constructors you have mentioned don't have many users. The other ones look ok.
>
> Thanks and best regards,
> Martin
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Donnerstag, 15. September 2016 20:06
> To: hotspot-compiler-dev at openjdk.java.net
> Cc: Doerr, Martin <martin.doerr at sap.com>
> Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms
>
> Good but it is not enough.
>
> emit_array_address() in c1_LIRGenerator_x86.cpp has the same problem.
> I would suggest looking at all places where the following methods are called and making sure they are correct:
>
> LIR_Address(LIR_Opr base, intx disp, BasicType type)
> LIR_Address(LIR_Opr base, LIR_Opr index, Scale scale, intx disp, BasicType type)
>
> Thanks,
> Vladimir
>
> On 9/15/16 8:25 AM, Doerr, Martin wrote:
>> Hi,
>>
>>
>>
>> as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However,
>> int may overflow on 64 bit platforms.
>>
>>
>>
>> Please review the following webrev:
>>
>> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/
>>
>>
>>
>> I'll also need a sponsor, please.
>>
>>
>>
>> Thanks and best regards,
>>
>> Martin
>>
>>
>>
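
For illustration, a minimal Java sketch of the kind of overflow discussed in this thread: folding a constant index into a displacement with 32-bit arithmetic. The class and method names, and the constants used, are hypothetical and not taken from the webrev. The real C1 code is C++, where signed int overflow is undefined behavior rather than a silent wrap, so this only shows why a wider type (such as intx) is needed.

```java
// Illustrative sketch only; names and constants are hypothetical.
final class DisplacementFolding {
    // Folds a constant index into the displacement with 32-bit arithmetic.
    // In Java the result wraps silently; in C++ signed overflow is undefined.
    static int foldNarrow(int index, int shift, int disp) {
        return (index << shift) + disp;
    }

    // Same computation, widened to 64 bits before shifting so no bits are lost.
    static long foldWide(int index, int shift, int disp) {
        return ((long) index << shift) + disp;
    }

    // True if the folded value can still be encoded as a 32-bit displacement.
    static boolean fitsInInt(long value) {
        return value == (int) value;
    }

    public static void main(String[] args) {
        int index = 0x40000000; // large constant array index
        int shift = 3;          // element size of 8 bytes
        int disp = 16;          // assumed array header size
        System.out.println("narrow: " + foldNarrow(index, shift, disp));
        System.out.println("wide:   " + foldWide(index, shift, disp));
        System.out.println("fits:   " + fitsInInt(foldWide(index, shift, disp)));
    }
}
```

A caller that computes the wide value first and checks fitsInInt before using the constant-displacement form can fall back to a register-based address when the check fails.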

From vladimir.kozlov at oracle.com  Mon Sep 19 17:24:48 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 19 Sep 2016 10:24:48 -0700
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
In-Reply-To: <5157ce5a-2b55-22e8-dd88-68bbfc237dae@oracle.com>
References: <57DC99C4.4030701@oracle.com>
	<5157ce5a-2b55-22e8-dd88-68bbfc237dae@oracle.com>
Message-ID: <e4624882-7b8a-5706-6836-88e9aa9888fc@oracle.com>

Thank you, David, for review.

I don't think we can do anything more here. It is the number of intrinsics that overflows some internal buffer related to 
var-tracking-assignments in gcc during compilation:

   VM_INTRINSICS_DO(VM_INTRINSIC_INFO, VM_SYMBOL_IGNORE, VM_SYMBOL_IGNORE, VM_SYMBOL_IGNORE, VM_ALIAS_IGNORE)

Thanks,
Vladimir

On 9/18/16 10:01 PM, David Holmes wrote:
> On 17/09/2016 11:17 AM, Vladimir Kozlov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8166096
>>
>> +++ b/make/lib/JvmOverrideFiles.gmk    Fri Sep 16 18:14:06 2016 -0700
>> @@ -31,6 +31,7 @@
>>
>>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>>    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := -fno-var-tracking-assignments -O0
>> +  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS := -fno-var-tracking-assignments
>>  endif
>>
>>  ifeq ($(OPENJDK_TARGET_OS), linux)
>>
>>
>> Remove annoying Hotspot compilation warning:
>
> Seems reasonable as a short term silencer, but ... does it imply the code needs to be changed somehow?
>
> Thanks,
> David
>
>> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member
>> function 'static objArrayHandle
>> CompilerToVM::initialize_intrinsics(Thread*)':
>> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: variable
>> tracking size limit exceeded with -fvar-tracking-assignments, retrying
>> without
>>  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {
>>
>> Thanks,
>> Vladimir

From vladimir.kozlov at oracle.com  Mon Sep 19 17:25:12 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 19 Sep 2016 10:25:12 -0700
Subject: [9] RFR[XS] 8166096: variable tracking size limit exceeded in
	jvmciCompilerToVM.cpp
In-Reply-To: <daec6089-b9e5-cf04-2b2f-a2eea2d1b319@oracle.com>
References: <57DC99C4.4030701@oracle.com>
	<daec6089-b9e5-cf04-2b2f-a2eea2d1b319@oracle.com>
Message-ID: <07c1b43a-648c-d2f2-d1ac-61a6f6ff58fd@oracle.com>

Thank you, Erik

Vladimir

On 9/19/16 12:58 AM, Erik Joelsson wrote:
> Looks ok to me.
>
> /Erik
>
>
> On 2016-09-17 03:17, Vladimir Kozlov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8166096
>>
>> +++ b/make/lib/JvmOverrideFiles.gmk    Fri Sep 16 18:14:06 2016 -0700
>> @@ -31,6 +31,7 @@
>>
>>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>>    BUILD_LIBJVM_vmStructs.cpp_CXXFLAGS := -fno-var-tracking-assignments -O0
>> +  BUILD_LIBJVM_jvmciCompilerToVM.cpp_CXXFLAGS := -fno-var-tracking-assignments
>>  endif
>>
>>  ifeq ($(OPENJDK_TARGET_OS), linux)
>>
>>
>> Remove annoying Hotspot compilation warning:
>>
>> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp: In static member function 'static objArrayHandle
>> CompilerToVM::initialize_intrinsics(Thread*)':
>> hotspot/src/share/vm/jvmci/jvmciCompilerToVM.cpp:206:16: note: variable tracking size limit exceeded with
>> -fvar-tracking-assignments, retrying without
>>  objArrayHandle CompilerToVM::initialize_intrinsics(TRAPS) {
>>
>> Thanks,
>> Vladimir
>

From vladimir.kozlov at oracle.com  Mon Sep 19 17:26:36 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 19 Sep 2016 10:26:36 -0700
Subject: RFR(XS) : 8166164 :
	compiler/compilercontrol/share/processors/LogProcessor.java does not
	close Scanner
In-Reply-To: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>
References: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>
Message-ID: <ffccb27f-e5e6-06f6-1ada-5a467e68daf3@oracle.com>

Good.

thanks,
Vladimir

On 9/19/16 2:38 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>> 16 lines changed: 2 ins; 0 del; 14 mod;
>
> Hi all,
>
> could you please review this small patch which fixes a resource leak in compiler/compilercontrol tests?
> LogProcessor::getScanner creates a new Scanner, but there is no code which closes it. This leak leads to a 'failed to clean up files after test' error from jtreg.
>
> the fix was tested by running :hotspot_compiler test group.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8166164
> webrev: http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>
> Thanks,
> -- Igor
>
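
For illustration, a minimal Java sketch of the pattern that avoids this kind of leak: opening the Scanner in a try-with-resources block so the underlying file is closed before the test directory is cleaned up. This is not the code from the webrev; the class, the method names and the log contents are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Scanner;

// Illustrative sketch only; not the actual LogProcessor change.
final class ScannerCloseSketch {
    // Counts lines containing 'needle'. The Scanner (and the file behind it)
    // is closed when the try block exits, even if an exception is thrown.
    static int countMatchingLines(Path log, String needle) {
        int count = 0;
        try (Scanner scanner = new Scanner(log)) {
            while (scanner.hasNextLine()) {
                if (scanner.nextLine().contains(needle)) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Writes a small temporary log, scans it, then deletes it. The delete
    // only succeeds on every platform if no handle to the file is still open.
    static int demo() {
        try {
            Path log = Files.createTempFile("compile", ".log");
            Files.write(log, Arrays.asList("made not entrant", "ok", "made not entrant"));
            int count = countMatchingLines(log, "not entrant");
            Files.delete(log);
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```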

From vladimir.kozlov at oracle.com  Mon Sep 19 17:32:05 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 19 Sep 2016 10:32:05 -0700
Subject: [9] RFR(S): 8166046: [TESTBUG]
	compiler/stringopts/TestStringObjectInitialization.java fails with OOME
In-Reply-To: <57DFD6C8.2080508@oracle.com>
References: <57DFD6C8.2080508@oracle.com>
Message-ID: <1391f045-7e4a-9301-0c4d-4a0083403486@oracle.com>

Should we scale down the compilation threshold too? The test verifies a C2 optimization, so we need to make sure that we still 
catch the original JDK-8159244 problem.

Thanks,
Vladimir

On 9/19/16 5:15 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8166046
> http://cr.openjdk.java.net/~thartmann/8166046/webrev.00/
>
> The test creates 101 threads that each execute a loop with 10.000 iterations that append to a String another String of size 17. This results in a String of size 101 * 10.000 * 17 = 17.170.000 ( = 35 MB). In the failing cases, the test is executed on 32-bit Windows with -Xcomp and -XX:+DeoptimizeALot which increase memory consumption of the VM due to extensive (re-)compilation, deoptimization and re-allocation. The test fails because there is not enough heap space to hold the String.
>
> I reduced the number of threads to 32+1 and the number of per-thread iterations to 1000. I verified that this fixes the OOMEs on the failing 32-bit Windows machines.
>
> Thanks,
> Tobias
>
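
For illustration, the size estimate from the description above can be reproduced with a short Java sketch. The figure of 2 bytes per char is an assumption made here (UTF-16 storage); it matches the roughly 35 MB quoted. The class and method names are hypothetical.

```java
// Illustrative arithmetic only; 2 bytes per char is an assumption.
final class StringFootprint {
    // Total number of chars appended by all threads.
    static long chars(int threads, int iterations, int appendLength) {
        return (long) threads * iterations * appendLength;
    }

    // Approximate payload size in bytes for the given char count.
    static long bytes(long chars, int bytesPerChar) {
        return chars * bytesPerChar;
    }

    public static void main(String[] args) {
        long before = chars(101, 10_000, 17); // original test parameters
        long after = chars(32 + 1, 1_000, 17); // reduced parameters
        System.out.println(before + " chars, " + bytes(before, 2) + " bytes");
        System.out.println(after + " chars, " + bytes(after, 2) + " bytes");
    }
}
```

With the reduced parameters the final String shrinks by a factor of about 30, which is consistent with the OOMEs disappearing on the 32-bit machines.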

From doug.simon at oracle.com  Mon Sep 19 20:21:30 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Mon, 19 Sep 2016 22:21:30 +0200
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <92B9E4F8-DF56-475B-A9EC-6FB179C58925@twitter.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
	<A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>
	<7ED300F2-253B-4550-BF5E-878A99EDAEB2@oracle.com>
	<92B9E4F8-DF56-475B-A9EC-6FB179C58925@twitter.com>
Message-ID: <8A747A8B-7D76-471C-9709-4F850629F67C@oracle.com>


> On 06 Sep 2016, at 23:58, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
>> 
>> On Sep 6, 2016, at 11:37 AM, Doug Simon <doug.simon at oracle.com> wrote:
>> 
>> 
>>> On 06 Sep 2016, at 20:14, Christian Thalinger <cthalinger at twitter.com> wrote:
>>> 
>>> 
>>>> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> In jvmci-8, we increased the interpreter code size when JVMCI code is included: 
>>>> 
>>>> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 
>>> 
>>> What about SPARC?  Have we ever seen a problem there?  Or AArch64 for that matter?
>> 
>> I've only ever seen problems on AMD64. I've never seen it on SPARC and have never run on AArch64.
>> 
>> The real fix is that the interpreter generator should never have to guess the size of the code buffer it needs but should resize things as needed after generating the interpreter.
> 
> Yes, it should.

In the hope that this gets addressed one day: https://bugs.openjdk.java.net/browse/JDK-8166317

-Doug


From tobias.hartmann at oracle.com  Tue Sep 20 08:27:35 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 20 Sep 2016 10:27:35 +0200
Subject: [9] RFR(S): 8166046: [TESTBUG]
	compiler/stringopts/TestStringObjectInitialization.java fails with OOME
In-Reply-To: <1391f045-7e4a-9301-0c4d-4a0083403486@oracle.com>
References: <57DFD6C8.2080508@oracle.com>
	<1391f045-7e4a-9301-0c4d-4a0083403486@oracle.com>
Message-ID: <57E0F2F7.8000808@oracle.com>

Hi Vladimir,

On 19.09.2016 19:32, Vladimir Kozlov wrote:
> Should we scale down the compilation threshold too? The test verifies a C2 optimization, so we need to make sure that we still catch the original JDK-8159244 problem.

I verified that the test still (rarely) triggers the problem I fixed with JDK-8159244 but while testing I found a better way to avoid the OOMEs:
http://cr.openjdk.java.net/~thartmann/8166046/webrev.01

This significantly reduces the runtime of the test from 1m30s to 12s on my machine and triggers JDK-8159244 in 100% of the runs. I verified that the patch still fixes the OOMEs on the 32-bit Windows machines.

Thanks,
Tobias

> Thanks,
> Vladimir
> 
> On 9/19/16 5:15 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8166046
>> http://cr.openjdk.java.net/~thartmann/8166046/webrev.00/
>>
>> The test creates 101 threads that each execute a loop with 10.000 iterations that append to a String another String of size 17. This results in a String of size 101 * 10.000 * 17 = 17.170.000 ( = 35 MB). In the failing cases, the test is executed on 32-bit Windows with -Xcomp and -XX:+DeoptimizeALot which increase memory consumption of the VM due to extensive (re-)compilation, deoptimization and re-allocation. The test fails because there is not enough heap space to hold the String.
>>
>> I reduced the number of threads to 32+1 and the number of per-thread iterations to 1000. I verified that this fixes the OOMEs on the failing 32-bit Windows machines.
>>
>> Thanks,
>> Tobias
>>

From igor.ignatyev at oracle.com  Tue Sep 20 14:36:55 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 20 Sep 2016 17:36:55 +0300
Subject: RFR(XS) : 8166164 :
	compiler/compilercontrol/share/processors/LogProcessor.java
	does not close Scanner
In-Reply-To: <ffccb27f-e5e6-06f6-1ada-5a467e68daf3@oracle.com>
References: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>
	<ffccb27f-e5e6-06f6-1ada-5a467e68daf3@oracle.com>
Message-ID: <EA923049-529A-423E-8E77-91B237D426D9@oracle.com>

Vladimir,

thank you for review.

-- Igor

> On Sep 19, 2016, at 8:26 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> thanks,
> Vladimir
> 
> On 9/19/16 2:38 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>>> 16 lines changed: 2 ins; 0 del; 14 mod;
>> 
>> Hi all,
>> 
>> could you please review this small patch which fixes resource leak in compiler/compilercontrol tests?
>> LogProcessor::getScanner creates a new Scanner, but there is no code which closes it. This leak leads to a 'failed to clean up files after test' error from jtreg.
>> 
>> the fix was tested by running :hotspot_compiler test group.
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8166164
>> webrev: http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>> 
>> Thanks,
>> -- Igor
>> 


From igor.ignatyev at oracle.com  Tue Sep 20 14:37:14 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 20 Sep 2016 17:37:14 +0300
Subject: RFR(XS) : 8166164 :
	compiler/compilercontrol/share/processors/LogProcessor.java
	does not close Scanner
In-Reply-To: <CB865A93-337C-4377-85A8-DF91876FC084@oracle.com>
References: <DFAD4836-5DDC-454B-99E4-9664364BF46D@oracle.com>
	<CB865A93-337C-4377-85A8-DF91876FC084@oracle.com>
Message-ID: <427682F7-2B61-464A-987D-C8BEC6FEACC4@oracle.com>

Pavel,

thanks for review.

-- Igor
> On Sep 19, 2016, at 3:18 PM, Pavel Punegov <pavel.punegov at oracle.com> wrote:
> 
> Hi Igor,
> 
> the change looks good. Thanks for fixing.
> 
> -- Pavel.
> 
>> On 19 Sep 2016, at 12:38, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
>> 
>> http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>>> 16 lines changed: 2 ins; 0 del; 14 mod;
>> 
>> Hi all,
>> 
>> could you please review this small patch which fixes resource leak in compiler/compilercontrol tests? 
>> LogProcessor::getScanner creates a new Scanner, but there is no code which closes it. This leak leads to a 'failed to clean up files after test' error from jtreg.
>> 
>> the fix was tested by running :hotspot_compiler test group.
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8166164
>> webrev: http://cr.openjdk.java.net/~iignatyev/8166164/webrev.00/
>> 
>> Thanks,
>> -- Igor
>> 
> 


From vladimir.kozlov at oracle.com  Tue Sep 20 16:58:12 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 20 Sep 2016 09:58:12 -0700
Subject: [9] RFR(S): 8166046: [TESTBUG]
	compiler/stringopts/TestStringObjectInitialization.java fails with OOME
In-Reply-To: <57E0F2F7.8000808@oracle.com>
References: <57DFD6C8.2080508@oracle.com>
	<1391f045-7e4a-9301-0c4d-4a0083403486@oracle.com>
	<57E0F2F7.8000808@oracle.com>
Message-ID: <e05e296e-ff92-8bf2-f310-88eedaf96345@oracle.com>

Perfect.

thanks,
Vladimir

On 9/20/16 1:27 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 19.09.2016 19:32, Vladimir Kozlov wrote:
>> Should we scale down the compilation threshold too? The test verifies a C2 optimization, so we need to make sure that we still catch the original JDK-8159244 problem.
>
> I verified that the test still (rarely) triggers the problem I fixed with JDK-8159244 but while testing I found a better way to avoid the OOMEs:
> http://cr.openjdk.java.net/~thartmann/8166046/webrev.01
>
> This significantly reduces the runtime of the test from 1m30s to 12s on my machine and triggers JDK-8159244 in 100% of the runs. I verified that the patch still fixes the OOMEs on the 32-bit Windows machines.
>
> Thanks,
> Tobias
>
>> Thanks,
>> Vladimir
>>
>> On 9/19/16 5:15 AM, Tobias Hartmann wrote:
>>> Hi,
>>>
>>> please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8166046
>>> http://cr.openjdk.java.net/~thartmann/8166046/webrev.00/
>>>
>>> The test creates 101 threads that each execute a loop with 10.000 iterations that append to a String another String of size 17. This results in a String of size 101 * 10.000 * 17 = 17.170.000 ( = 35 MB). In the failing cases, the test is executed on 32-bit Windows with -Xcomp and -XX:+DeoptimizeALot which increase memory consumption of the VM due to extensive (re-)compilation, deoptimization and re-allocation. The test fails because there is not enough heap space to hold the String.
>>>
>>> I reduced the number of threads to 32+1 and the number of per-thread iterations to 1000. I verified that this fixes the OOMEs on the failing 32-bit Windows machines.
>>>
>>> Thanks,
>>> Tobias
>>>

From tobias.hartmann at oracle.com  Tue Sep 20 17:01:28 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 20 Sep 2016 19:01:28 +0200
Subject: [9] RFR(S): 8166046: [TESTBUG]
	compiler/stringopts/TestStringObjectInitialization.java fails with OOME
In-Reply-To: <e05e296e-ff92-8bf2-f310-88eedaf96345@oracle.com>
References: <57DFD6C8.2080508@oracle.com>
	<1391f045-7e4a-9301-0c4d-4a0083403486@oracle.com>
	<57E0F2F7.8000808@oracle.com>
	<e05e296e-ff92-8bf2-f310-88eedaf96345@oracle.com>
Message-ID: <57E16B68.1040001@oracle.com>

Thanks, Vladimir.

Best regards,
Tobias

On 20.09.2016 18:58, Vladimir Kozlov wrote:
> Perfect.
> 
> thanks,
> Vladimir
> 
> On 9/20/16 1:27 AM, Tobias Hartmann wrote:
>> Hi Vladimir,
>>
>> On 19.09.2016 19:32, Vladimir Kozlov wrote:
>>> Should we scale down the compilation threshold too? The test verifies a C2 optimization, so we need to make sure that we still catch the original JDK-8159244 problem.
>>
>> I verified that the test still (rarely) triggers the problem I fixed with JDK-8159244 but while testing I found a better way to avoid the OOMEs:
>> http://cr.openjdk.java.net/~thartmann/8166046/webrev.01
>>
>> This significantly reduces the runtime of the test from 1m30s to 12s on my machine and triggers JDK-8159244 in 100% of the runs. I verified that the patch still fixes the OOMEs on the 32-bit Windows machines.
>>
>> Thanks,
>> Tobias
>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 9/19/16 5:15 AM, Tobias Hartmann wrote:
>>>> Hi,
>>>>
>>>> please review the following patch:
>>>> https://bugs.openjdk.java.net/browse/JDK-8166046
>>>> http://cr.openjdk.java.net/~thartmann/8166046/webrev.00/
>>>>
>>>> The test creates 101 threads that each execute a loop with 10.000 iterations that append to a String another String of size 17. This results in a String of size 101 * 10.000 * 17 = 17.170.000 ( = 35 MB). In the failing cases, the test is executed on 32-bit Windows with -Xcomp and -XX:+DeoptimizeALot which increase memory consumption of the VM due to extensive (re-)compilation, deoptimization and re-allocation. The test fails because there is not enough heap space to hold the String.
>>>>
>>>> I reduced the number of threads to 32+1 and the number of per-thread iterations to 1000. I verified that this fixes the OOMEs on the failing 32-bit Windows machines.
>>>>
>>>> Thanks,
>>>> Tobias
>>>>

From HORII at jp.ibm.com  Tue Sep 20 17:03:19 2016
From: HORII at jp.ibm.com (Hiroshi H Horii)
Date: Wed, 21 Sep 2016 02:03:19 +0900
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
In-Reply-To: <OF069946E4.2F8AAC40-ON49258032.006041F6-49258032.0060B791@LocalDomain>
References: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
	<fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
	<OF069946E4.2F8AAC40-ON49258032.006041F6-49258032.0060B791@LocalDomain>
Message-ID: <OF8FF66E87.7F5BDE86-ON49258034.005D75B1-49258034.005DAFF6@notes.na.collabserv.com>

Hi all,

Martin kindly created a webrev with some good corrections.
http://cr.openjdk.java.net/~mdoerr/8164920_ppc_crc32/webrev.01/

Could someone review this change again?

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo


From:   Hiroshi H Horii/Japan/IBM
To:     "Doerr, Martin" <martin.doerr at sap.com>
Cc:     Gustavo Bueno Romero <gromero at br.ibm.com>, 
"hotspot-compiler-dev at openjdk.java.net" 
<hotspot-compiler-dev at openjdk.java.net>, "Volker Simonis 
(volker.simonis at gmail.com)" <volker.simonis at gmail.com>
Date:   09/19/2016 02:36
Subject:        RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic


Hi Martin, and all
(Please allow me to send this mail twice. The first mail is being held 
because it exceeded 100KB.)

Thank you for your review. Gustavo and I created a new change based 
on your comments. I would like to request a review again.

My account on the cr server is not available now (because of my mistake...) 
and Gustavo cannot create a webrev file for another reason. I would like 
to attach a diff file created with "hg diff -g" in hotspot.  If possible, 
could someone create a webrev file with this changeset?

[attachment "hotspot.crc32.20160918.changeset" deleted by Hiroshi H 
Horii/Japan/IBM] 

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo

"Doerr, Martin" <martin.doerr at sap.com> wrote on 09/13/2016 18:35:09:

> From: "Doerr, Martin" <martin.doerr at sap.com>
> To: Hiroshi H Horii/Japan/IBM at IBMJP, "hotspot-compiler-
> dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
> Cc: "Volker Simonis (volker.simonis at gmail.com)" 
> <volker.simonis at gmail.com>, Gustavo Bueno Romero <gromero at br.ibm.com>
> Date: 09/13/2016 18:36
> Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> Hi Hiroshi,
> 
> we appreciate your change. Thanks for contributing it.
> It basically looks good, but I'd like to propose some minor improvements.
> 
> 
> kernel_crc32_1word_vpmsumd:
> 
> 1. The Pre-align code can be implemented shorter:
>     clrldi_(prealign, buf, 57);
>     beq(CCR0, L_alignHead);
> 
>     subfic(prealign, prealign, 128);
> 
> 2. I'd prefer the label name "L_alignedHead".
> 
> 3. The branch b(L_alignTail) and the label are not needed and should
> get removed.
> 
> 
> kernel_crc32_1word_aligned:
> 
> 1. When saving and restoring non-volatile vector registers, please 
> use offset differences of -16 instead of -32.
> (The ABI allows up to 288 bytes to be used in frameless functions so
> it will fit if -16 is used.)
> 
> 2. The std instructions should better be used with int offsets so 
> you can get rid of the addi(offset, offset, -8) instructions.
> 
> 
> Comments:
> For single line comments "//" should be used instead of "/*". Would 
> be nice if you could change them.
> 
> 
> Thanks and best regards,
> Martin
> 
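
For illustration, the arithmetic behind the shorter pre-align sequence suggested above, written as a Java sketch. clrldi_ with 57 clears the upper 57 bits of the 64-bit address and keeps the low 7 bits (the offset within a 128-byte block); subfic then computes 128 minus that offset. The class and method names are hypothetical.

```java
// Illustrative sketch of the pre-align arithmetic; names are hypothetical.
final class PreAlign {
    // Number of bytes to process before 'address' reaches a 128-byte boundary.
    static long bytesUntilAligned(long address) {
        long low = address & 127L; // clrldi_(prealign, buf, 57): keep low 7 bits
        if (low == 0) {
            return 0;              // beq(CCR0, ...): already aligned, skip the head
        }
        return 128 - low;          // subfic(prealign, prealign, 128)
    }

    public static void main(String[] args) {
        System.out.println(bytesUntilAligned(0x1000L));
        System.out.println(bytesUntilAligned(0x1001L));
        System.out.println(bytesUntilAligned(0x107FL));
    }
}
```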
> 
> From: Hiroshi H Horii [mailto:HORII at jp.ibm.com] 
> Sent: Dienstag, 6. September 2016 16:50
> To: hotspot-compiler-dev at openjdk.java.net; vladimir.kozlov at oracle.com
> Cc: Volker Simonis (volker.simonis at gmail.com) 
> <volker.simonis at gmail.com>; Doerr, Martin <martin.doerr at sap.com>; 
> Gustavo Bueno Romero <gromero at br.ibm.com>
> Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> Dear Vladimir and all:
> 
> Can I please request reviews for the following change?
> 
> JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
> webrev: http://cr.openjdk.java.net/~gromero/8164920/01/
> 
> As Volker commented in the above JIRA, this is a ppc64-only 
> improvement which will not 
> affect any of the Oracle platforms in any way. 
> 
> This change includes a new implementation of the CRC32 intrinsics for ppc64le.
> In my local experiment, CRC32 of 64KB was calculated more than 20 
> times faster than with the original. 
> Performance of the CRC32 intrinsic is important for running recent Apache Cassandra.
> A Cassandra daemon needs to read 64KB data from a disk with CRC32 
> checksum by default.
> 
> This JIRA entry has "jdk9-fc-request" label.
> If there is a chance to include new change in JDK 9 for ppc64le, I 
> would like to request 
> a review for this change.
> 
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
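
For illustration, a minimal Java sketch of the Java-level operation under discussion: checksumming a 64KB block with java.util.zip.CRC32. The class and method names are hypothetical, and the sketch says nothing about how the ppc64le stub is implemented or how Cassandra calls it; it only shows the API whose bulk update path a CRC32 intrinsic would be expected to accelerate.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative sketch only; names are hypothetical.
final class Crc32Sketch {
    // Computes the CRC32 checksum of the whole array in one bulk update.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Convenience wrapper for checking against well-known test vectors.
    static long checksumOfAscii(String text) {
        return checksum(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        byte[] block = new byte[64 * 1024]; // one 64KB block, as in the description
        System.out.printf("%08x%n", checksum(block));
    }
}
```

A quick sanity check for any new CRC32 implementation is the standard test vector: the ASCII string "123456789" has the checksum 0xCBF43926.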



From cthalinger at twitter.com  Tue Sep 20 17:59:30 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Tue, 20 Sep 2016 07:59:30 -1000
Subject: RFR: 8165457: [JVMCI] increase InterpreterCodeSize for JVMCI
In-Reply-To: <8A747A8B-7D76-471C-9709-4F850629F67C@oracle.com>
References: <39E38A4A-7DEB-49C3-BC8B-C41C9F0F0ED1@oracle.com>
	<A493D25F-BA16-43B4-8719-56419ADB294F@twitter.com>
	<7ED300F2-253B-4550-BF5E-878A99EDAEB2@oracle.com>
	<92B9E4F8-DF56-475B-A9EC-6FB179C58925@twitter.com>
	<8A747A8B-7D76-471C-9709-4F850629F67C@oracle.com>
Message-ID: <B73B1246-7B09-491F-B2D0-C7F64D340B94@twitter.com>


> On Sep 19, 2016, at 10:21 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 06 Sep 2016, at 23:58, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>>> 
>>> On Sep 6, 2016, at 11:37 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> 
>>>> On 06 Sep 2016, at 20:14, Christian Thalinger <cthalinger at twitter.com> wrote:
>>>> 
>>>> 
>>>>> On Sep 5, 2016, at 6:49 AM, Doug Simon <doug.simon at oracle.com> wrote:
>>>>> 
>>>>> In jvmci-8, we increased the interpreter code size when JVMCI code is included: 
>>>>> 
>>>>> http://hg.openjdk.java.net/graal/graal-jvmci-8/file/a074ae16281d/src/cpu/x86/vm/templateInterpreter_x86.hpp#l37 
>>>> 
>>>> What about SPARC?  Have we ever seen a problem there?  Or AArch64 for that matter?
>>> 
>>> I've only ever seen problems on AMD64. I've never seen it on SPARC and have never run on AArch64.
>>> 
>>> The real fix is that the interpreter generator should never have to guess the size of the code buffer it needs but should resize things as needed after generating the interpreter.
>> 
>> Yes, it should.
> 
> In the hope that this gets addressed one day: https://bugs.openjdk.java.net/browse/JDK-8166317

??

From michael.c.berg at intel.com  Tue Sep 20 23:02:58 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Tue, 20 Sep 2016 23:02:58 +0000
Subject: CR for RFR 8129376
Message-ID: <C568518E7B433348B114B6A7122D474756F192FE@FMSMSX102.amr.corp.intel.com>

Hi Folks,

Performance on client x86 targets was hampered in two SPECjvm98 metrics (mpegaudio and  mtrt) for 32-bit since we added AVX512.  I also checked to make sure only client on x86-32 was affected.  I have mitigated this by altering the xmm pad modeling in the x86-32-bit machine description so that register allocation cannot see the dummy definitions, enabling the desired performance while retaining correctness for 32-bit on AVX512.

This code was tested as follows: hotspot jreg, SPECjvm2008, SPECjvm98 on hsw, skx and knl targets complete with no issues on 32-bit.  These changes do not alter behavior on x86-64.

Bug-id: https://bugs.openjdk.java.net/browse/JDK-8129376

webrev:
http://cr.openjdk.java.net/~mcberg/8129376/webrev.01

Regards,
Michael

From vladimir.kozlov at oracle.com  Tue Sep 20 23:15:52 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 20 Sep 2016 16:15:52 -0700
Subject: CR for RFR 8129376
In-Reply-To: <C568518E7B433348B114B6A7122D474756F192FE@FMSMSX102.amr.corp.intel.com>
References: <C568518E7B433348B114B6A7122D474756F192FE@FMSMSX102.amr.corp.intel.com>
Message-ID: <57E1C328.4020402@oracle.com>

Changes look good.

Michael, can you explain how pads affected code generation to cause regression?
.ad changes affect Server VM (c2) code generation. Do you need it based on performance numbers? Try running Server VM with -XX:-TieredCompilation.

Thanks,
Vladimir

On 9/20/16 4:02 PM, Berg, Michael C wrote:
> Hi Folks,
>
> Performance on client x86 targets was hampered in two SPECjvm98 metrics (mpegaudio and  mtrt) for 32-bit since we added AVX512.  I also checked to make sure only client on x86-32 was affected.  I have
> mitigated this by altering the xmm pad modeling in the x86-32-bit machine description so that register allocation cannot see the dummy definitions, enabling the desired performance while retaining
> correctness for 32-bit on AVX512.
>
> This code was tested as follows: hotspot jreg, SPECjvm2008, SPECjvm98 on hsw, skx and knl targets complete with no issues on 32-bit.  These changes do not alter behavior on x86-64.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8129376
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8129376/webrev.01
>
> Regards,
>
> Michael
>

From michael.c.berg at intel.com  Tue Sep 20 23:22:09 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Tue, 20 Sep 2016 23:22:09 +0000
Subject: CR for RFR 8129376
In-Reply-To: <57E1C328.4020402@oracle.com>
References: <C568518E7B433348B114B6A7122D474756F192FE@FMSMSX102.amr.corp.intel.com>
	<57E1C328.4020402@oracle.com>
Message-ID: <C568518E7B433348B114B6A7122D474756F1934B@FMSMSX102.amr.corp.intel.com>

Vladimir,

The way they were versed they caused allocation issues on part of the xmm bank for client only. 
I believe I ran with both tiered off and inlining off while sleuthing the issue and after I applied the change as part of my verification process.  I tested client and server 32-bit.
With the change applied 32-bit server performance does not seem affected.  The generated code looks like it did before the change was applied now.

Regards,
Michael

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Tuesday, September 20, 2016 4:16 PM
To: hotspot-compiler-dev at openjdk.java.net
Cc: Berg, Michael C <michael.c.berg at intel.com>
Subject: Re: CR for RFR 8129376

Changes look good.

Michael, can you explain how pads affected code generation to cause regression?
.ad changes affect Server VM (c2) code generation. Do you need it based on performance numbers? Try running Server VM with -XX:-TieredCompilation.

Thanks,
Vladimir

On 9/20/16 4:02 PM, Berg, Michael C wrote:
> Hi Folks,
>
> Performance on client x86 targets was hampered in two SPECjvm98 
> metrics (mpegaudio and  mtrt) for 32-bit since we added AVX512.  I 
> also checked to make sure only client on x86-32 was affected.  I have mitigated this by altering the xmm pad modeling in the x86-32-bit machine description so that register allocation cannot see the dummy definitions, enabling the desired performance while retaining correctness for 32-bit on AVX512.
>
> This code was tested as follows: hotspot jreg, SPECjvm2008, SPECjvm98 on hsw, skx and knl targets complete with no issues on 32-bit.  These changes do not alter behavior on x86-64.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8129376
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8129376/webrev.01
>
> Regards,
>
> Michael
>

From michael.c.berg at intel.com  Tue Sep 20 23:24:19 2016
From: michael.c.berg at intel.com (Berg, Michael C)
Date: Tue, 20 Sep 2016 23:24:19 +0000
Subject: CR for RFR 8129376
References: <C568518E7B433348B114B6A7122D474756F192FE@FMSMSX102.amr.corp.intel.com>
	<57E1C328.4020402@oracle.com> 
Message-ID: <C568518E7B433348B114B6A7122D474756F19366@FMSMSX102.amr.corp.intel.com>

Small augment...

-----Original Message-----
From: Berg, Michael C 
Sent: Tuesday, September 20, 2016 4:22 PM
To: 'Vladimir Kozlov' <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
Subject: RE: CR for RFR 8129376

Vladimir,

The way they were versed they caused allocation issues on part of the xmm bank for client only. 
I believe I ran with both tiered off and inlining off while sleuthing the issue and after I applied the change as part of my verification process.  I tested client and server 32-bit.
With the change applied 32-bit server performance does not seem affected.  The generated code looks like it did before the <AVX512> change was applied now.

Regards,
Michael

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Tuesday, September 20, 2016 4:16 PM
To: hotspot-compiler-dev at openjdk.java.net
Cc: Berg, Michael C <michael.c.berg at intel.com>
Subject: Re: CR for RFR 8129376

Changes look good.

Michael, can you explain how pads affected code generation to cause regression?
.ad changes affect Server VM (C2) code generation. Do you need it based on performance numbers? Try running Server VM with -XX:-TieredCompilation.

Thanks,
Vladimir

On 9/20/16 4:02 PM, Berg, Michael C wrote:
> Hi Folks,
>
> Performance on client x86 targets was hampered in two SPECjvm98 
> metrics (mpegaudio and  mtrt) for 32-bit since we added AVX512.  I 
> also checked to make sure only client on x86-32 was affected.  I have mitigated this by altering the xmm pad modeling in the x86-32-bit machine description so that register allocation cannot see the dummy definitions, enabling the desired performance while retaining correctness for 32-bit on AVX512.
>
> This code was tested as follows: hotspot jreg, SPECjvm2008, SPECjvm98 on hsw, skx and knl targets complete with no issues on 32-bit.  These changes do not alter behavior on x86-64.
>
>
> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8129376
>
>
> webrev:
>
> http://cr.openjdk.java.net/~mcberg/8129376/webrev.01
>
> Regards,
>
> Michael
>

From vladimir.kozlov at oracle.com  Tue Sep 20 23:40:21 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 20 Sep 2016 16:40:21 -0700
Subject: CR for RFR 8129376
In-Reply-To: <C568518E7B433348B114B6A7122D474756F19366@FMSMSX102.amr.corp.intel.com>
References: <C568518E7B433348B114B6A7122D474756F192FE@FMSMSX102.amr.corp.intel.com>
	<57E1C328.4020402@oracle.com>
	<C568518E7B433348B114B6A7122D474756F19366@FMSMSX102.amr.corp.intel.com>
Message-ID: <57E1C8E5.8060808@oracle.com>

Okay, good then. I will sponsor it.

Thanks,
Vladimir

On 9/20/16 4:24 PM, Berg, Michael C wrote:
> Small augment...
>
> -----Original Message-----
> From: Berg, Michael C
> Sent: Tuesday, September 20, 2016 4:22 PM
> To: 'Vladimir Kozlov' <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: RE: CR for RFR 8129376
>
> Vladimir,
>
> The way they were expressed, the pads caused allocation issues on part of the xmm bank, for client only.
> I believe I ran with both tiered off and inlining off while sleuthing the issue and, as part of my verification process, after I applied the change.  I tested client and server 32-bit.
> With the change applied, 32-bit server performance does not seem affected.  The generated code now looks like it did before the AVX512 change was applied.
>
> Regards,
> Michael
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Tuesday, September 20, 2016 4:16 PM
> To: hotspot-compiler-dev at openjdk.java.net
> Cc: Berg, Michael C <michael.c.berg at intel.com>
> Subject: Re: CR for RFR 8129376
>
> Changes look good.
>
> Michael, can you explain how pads affected code generation to cause regression?
> .ad changes affect Server VM (C2) code generation. Do you need it based on performance numbers? Try running Server VM with -XX:-TieredCompilation.
>
> Thanks,
> Vladimir
>
> On 9/20/16 4:02 PM, Berg, Michael C wrote:
>> Hi Folks,
>>
>> Performance on client x86 targets was hampered in two SPECjvm98
>> metrics (mpegaudio and  mtrt) for 32-bit since we added AVX512.  I
>> also checked to make sure only client on x86-32 was affected.  I have mitigated this by altering the xmm pad modeling in the x86-32-bit machine description so that register allocation cannot see the dummy definitions, enabling the desired performance while retaining correctness for 32-bit on AVX512.
>>
>> This code was tested as follows: hotspot jreg, SPECjvm2008, SPECjvm98 on hsw, skx and knl targets complete with no issues on 32-bit.  These changes do not alter behavior on x86-64.
>>
>>
>> Bug-id: https://bugs.openjdk.java.net/browse/JDK-8129376
>>
>>
>> webrev:
>>
>> http://cr.openjdk.java.net/~mcberg/8129376/webrev.01
>>
>> Regards,
>>
>> Michael
>>

From tobias.hartmann at oracle.com  Wed Sep 21 07:01:42 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 21 Sep 2016 09:01:42 +0200
Subject: [9] RFR(S): 8161085: PreserveFPRegistersTest fails with
	'AssertionError: Final value has changed'
Message-ID: <57E23056.50200@oracle.com>

Hi,

please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8161085
http://cr.openjdk.java.net/~thartmann/8161085/webrev.00/

This problem is very similar to JDK-8148175: the test fails because G1 barriers emitted by C1 do not preserve floating point registers on SPARC. The barrier code calls into the runtime without saving/restoring the FP registers. I fixed this by using save_live_registers() instead of manually saving individual registers.

Tested with failing regression test on SPARC and RBT (running).

Thanks,
Tobias
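
The failure mode described above can be illustrated with a small, self-contained C++ model. This is only a sketch under stated assumptions: Cpu, runtime_call, barrier_manual_save and barrier_save_all are hypothetical names invented for illustration and are not HotSpot code. It shows why saving a hand-picked subset of registers around a runtime call can lose floating-point state, while saving every live register does not.

```cpp
#include <array>
#include <cassert>

// Toy model (hypothetical): a tiny "CPU" with integer and FP registers.
struct Cpu {
  std::array<long, 4> gpr{};
  std::array<double, 4> fpr{};
};

// Stand-in for a runtime routine that is free to clobber every register.
void runtime_call(Cpu& cpu) {
  cpu.gpr.fill(-1);
  cpu.fpr.fill(-1.0);
}

// Barrier that saves only the integer registers by hand.
// The FP registers come back clobbered.
void barrier_manual_save(Cpu& cpu) {
  std::array<long, 4> saved_gpr = cpu.gpr;
  runtime_call(cpu);
  cpu.gpr = saved_gpr;
}

// Barrier that saves and restores all live registers, FP included.
void barrier_save_all(Cpu& cpu) {
  Cpu saved = cpu;
  runtime_call(cpu);
  cpu = saved;
}
```

With a value held in fpr[0] across the barrier, the manual variant returns with that register overwritten, which is the kind of "final value has changed" symptom the test reports; the save-all variant preserves it.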

From vladimir.x.ivanov at oracle.com  Wed Sep 21 10:07:43 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 21 Sep 2016 13:07:43 +0300
Subject: [9] RFR(S): 8161085: PreserveFPRegistersTest fails with
	'AssertionError: Final value has changed'
In-Reply-To: <57E23056.50200@oracle.com>
References: <57E23056.50200@oracle.com>
Message-ID: <a1c85c4f-f7c4-3de7-7bc5-2f1f0cbb85c1@oracle.com>

Looks good.

Best regards,
Vladimir Ivanov

On 9/21/16 10:01 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8161085
> http://cr.openjdk.java.net/~thartmann/8161085/webrev.00/
>
> This problem is very similar to JDK-8148175, the test fails because G1 barriers emitted by C1 do not preserve floating point registers on SPARC. The problem is that the barrier code calls into the runtime without saving/restoring the FP registers. I fixed this by using save_live_registers() instead of manually saving individual registers.
>
> Tested with failing regression test on SPARC and RBT (running).
>
> Thanks,
> Tobias
>

From tobias.hartmann at oracle.com  Wed Sep 21 10:08:11 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 21 Sep 2016 12:08:11 +0200
Subject: [9] RFR(S): 8161085: PreserveFPRegistersTest fails with
	'AssertionError: Final value has changed'
In-Reply-To: <a1c85c4f-f7c4-3de7-7bc5-2f1f0cbb85c1@oracle.com>
References: <57E23056.50200@oracle.com>
	<a1c85c4f-f7c4-3de7-7bc5-2f1f0cbb85c1@oracle.com>
Message-ID: <57E25C0B.8090000@oracle.com>

Thanks, Vladimir!

Best regards,
Tobias

On 21.09.2016 12:07, Vladimir Ivanov wrote:
> Looks good.
> 
> Best regards,
> Vladimir Ivanov
> 
> On 9/21/16 10:01 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8161085
>> http://cr.openjdk.java.net/~thartmann/8161085/webrev.00/
>>
>> This problem is very similar to JDK-8148175, the test fails because G1 barriers emitted by C1 do not preserve floating point registers on SPARC. The problem is that the barrier code calls into the runtime without saving/restoring the FP registers. I fixed this by using save_live_registers() instead of manually saving individual registers.
>>
>> Tested with failing regression test on SPARC and RBT (running).
>>
>> Thanks,
>> Tobias
>>

From vladimir.kozlov at oracle.com  Wed Sep 21 16:02:05 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 21 Sep 2016 09:02:05 -0700
Subject: [9] RFR(S): 8161085: PreserveFPRegistersTest fails with
	'AssertionError: Final value has changed'
In-Reply-To: <57E23056.50200@oracle.com>
References: <57E23056.50200@oracle.com>
Message-ID: <8add0ad3-f4bf-c81d-823e-97fe509b2ef3@oracle.com>

Good.

Thanks,
Vladimir

On 9/21/16 12:01 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8161085
> http://cr.openjdk.java.net/~thartmann/8161085/webrev.00/
>
> This problem is very similar to JDK-8148175, the test fails because G1 barriers emitted by C1 do not preserve floating point registers on SPARC. The problem is that the barrier code calls into the runtime without saving/restoring the FP registers. I fixed this by using save_live_registers() instead of manually saving individual registers.
>
> Tested with failing regression test on SPARC and RBT (running).
>
> Thanks,
> Tobias
>

From tobias.hartmann at oracle.com  Wed Sep 21 16:07:47 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 21 Sep 2016 18:07:47 +0200
Subject: [9] RFR(S): 8161085: PreserveFPRegistersTest fails with
	'AssertionError: Final value has changed'
In-Reply-To: <8add0ad3-f4bf-c81d-823e-97fe509b2ef3@oracle.com>
References: <57E23056.50200@oracle.com>
	<8add0ad3-f4bf-c81d-823e-97fe509b2ef3@oracle.com>
Message-ID: <57E2B053.8020904@oracle.com>

Thanks, Vladimir!

Best regards,
Tobias

On 21.09.2016 18:02, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir
> 
> On 9/21/16 12:01 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8161085
>> http://cr.openjdk.java.net/~thartmann/8161085/webrev.00/
>>
>> This problem is very similar to JDK-8148175, the test fails because G1 barriers emitted by C1 do not preserve floating point registers on SPARC. The problem is that the barrier code calls into the runtime without saving/restoring the FP registers. I fixed this by using save_live_registers() instead of manually saving individual registers.
>>
>> Tested with failing regression test on SPARC and RBT (running).
>>
>> Thanks,
>> Tobias
>>

From vladimir.kozlov at oracle.com  Wed Sep 21 22:51:33 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 21 Sep 2016 15:51:33 -0700
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <F89640DCD01A85489FCBA68183A6A0F38EB96FE4@ORSMSX116.amr.corp.intel.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
	<4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
	<7de8489c-943b-5ecf-48c1-0bffad101070@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB957C3@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB96FE4@ORSMSX116.amr.corp.intel.com>
Message-ID: <57E30EF5.8010709@oracle.com>

To close the loop on this.

It looks like the machine on which the test failed had a well-known XMM saving problem in the Linux kernel.
So we decided to push the changes. You may have seen the notification already.

regards,
Vladimir

On 9/8/16 5:46 PM, Kharbas, Kishor wrote:
> Hi Vladimir,
> I couldn't reproduce the error on my 32-bit Linux machine. The test was done on a Sandy Bridge machine (which has the AVX instruction set).
> Please advise how to proceed further.
>
> Thanks
> Kishor
>
>
> -----Original Message-----
> From: Kharbas, Kishor
> Sent: Tuesday, September 6, 2016 5:40 PM
> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows
>
> Hi Vladimir,
>
> The patch only touches code under _WIN64. I am having a hard time understanding why the test fails on 32-bit Linux.
>
> Btw, that test passes on Windows 64 platform. I am planning to test on Linux too.
>
> Thanks
> Kishor
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Tuesday, September 6, 2016 2:31 PM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows
>
> The following jtreg test failed on 32-bit Linux:
>
> hotspot/test/compiler/runtime/Test7196199.java
>
> ----------System.err:(57/2416)----------
> test_incrc: [41] = 8.081506E20 != 150000.0
> test_incrc: [42] = 1.8632992E31 != 150000.0
> test_incrc: [43] = 2.8397877E29 != 150000.0 ...
>
> https://bugs.openjdk.java.net/browse/JDK-7196199
>
> was related to the upper bits (64-255) of XMM (YMM) registers not being saved/restored in the interrupt handler code during a safepoint.
>
> Looks like your changes are not enough.
>
> Vladimir
>
>
> On 9/6/16 10:12 AM, Vladimir Kozlov wrote:
>> Good. I am starting to test these changes. I will push them if testing passes.
>>
>> Thanks,
>> Vladimir
>>
>> On 9/2/16 3:07 PM, Kharbas, Kishor wrote:
>>> Thanks Vladimir,
>>>
>>> I have updated the patch :
>>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.02/
>>>
>>> I looked for other places in src/cpu/x86/vm. I feel every case is
>>> covered.
>>>
>>> - Kishor
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Thursday, September 1, 2016 11:39 AM
>>> To: Kharbas, Kishor <kishor.kharbas at intel.com>;
>>> hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get
>>> clobbered by a JNI call on windows
>>>
>>> Good. But looks like some code relied on old stack layout in stubs,
>>> for example sha256_AVX2():
>>>
>>> #ifndef _WIN64
>>>     _XMM_SAVE_SIZE = 0,
>>> #else
>>>     _XMM_SAVE_SIZE = 8*16,
>>> #endif
>>>
>>> Please, check that all other related code is fixed too. (I looked on
>>> all cases of _WIN64 in src/cpu/x86/vm/).
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 8/31/16 10:17 PM, Kharbas, Kishor wrote:
>>>> Hello,
>>>>
>>>> I removed the unwanted save and restore of registers in the range
>>>> XMM6-XMM31 from the x86_64 stubs.
>>>> I also removed the #ifdef _WIN64 block from x86.ad file.
>>>>
>>>> Link to the new patch :
>>>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/
>>>>
>>>> Thanks
>>>> Kishor
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Kharbas, Kishor
>>>> Sent: Wednesday, August 24, 2016 6:24 PM
>>>> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>;
>>>> hotspot-compiler-dev at openjdk.java.net
>>>> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
>>>> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get
>>>> clobbered by a JNI call on windows
>>>>
>>>> Thanks Vladimir for quick feedback.
>>>> I will look into the stubs which save the registers in the range
>>>> XMM6-XMM31. Also the first comment makes perfect sense.
>>>>
>>>> Thanks
>>>> Kishor
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Wednesday, August 24, 2016 3:08 PM
>>>> To: Kharbas, Kishor <kishor.kharbas at intel.com>;
>>>> hotspot-compiler-dev at openjdk.java.net
>>>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get
>>>> clobbered by a JNI call on windows
>>>>
>>>> Hi Kishor,
>>>>
>>>> First, #ifdef _WIN64 is not needed anymore since the calling convention
>>>> is similar to unix now.
>>>>
>>>> Second, I would like you to look more broadly. With this change we
>>>> don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not
>>>> sure that we can remove all #ifdef _WIN64 there but for most of them
>>>> I think we can do. Please, look.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
>>>>> Requesting the community to review the patch for
>>>>> https://bugs.openjdk.java.net/browse/JDK-8078122
>>>>>
>>>>> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>>>>>
>>>>> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Kishor
>>>>>
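
For readers following this thread, the underlying issue can be sketched in a few lines of C++. The Windows x64 calling convention treats XMM6-XMM15 as non-volatile only for their low 128 bits; the upper halves of the corresponding YMM registers are volatile. The model below is a hypothetical illustration (Ymm, native_callee and both caller functions are invented names, not HotSpot code) of why the compiled code must treat the full-width registers as caller-saved across a JNI call.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A 256-bit YMM register modeled as two 128-bit halves.
struct Ymm {
  std::array<std::uint64_t, 2> lo{};  // bits 0-127, visible as XMM
  std::array<std::uint64_t, 2> hi{};  // bits 128-255
};

// Hypothetical native callee: it preserves the low (XMM) half of a
// non-volatile register but is allowed to overwrite the upper half.
void native_callee(Ymm& r) {
  std::array<std::uint64_t, 2> saved_lo = r.lo;
  r.lo.fill(0);
  r.hi.fill(0);
  r.lo = saved_lo;  // only the low 128 bits are restored
}

// Caller that assumes the whole register is callee-saved.
Ymm call_assuming_callee_saved(Ymm r) {
  native_callee(r);
  return r;  // upper half may be lost
}

// Caller that spills the full 256 bits itself around the call.
Ymm call_with_caller_save(Ymm r) {
  Ymm spill = r;
  native_callee(r);
  return spill;  // full register restored from the caller's own spill
}
```

Under these assumptions, the first caller sees its upper 128 bits zeroed after the call while the second keeps them, which is the motivation for changing how XMM6-XMM31 are defined for WIN64.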

From doug.simon at oracle.com  Thu Sep 22 07:54:03 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 22 Sep 2016 09:54:03 +0200
Subject: RFR: 8166517: [JVMCI] export JVMCI to auto-detected JVMCI compiler
Message-ID: <C4C7CCED-4AF8-4C95-8846-40284551320A@oracle.com>

When JVMCI compiler auto-selection (JDK-8160730) is used, JVMCI needs to be exported to the selected compiler in the same way as if the -Djvmci.Compiler property had been specified.

https://bugs.openjdk.java.net/browse/JDK-8166517
http://cr.openjdk.java.net/~dnsimon/8166517/

-Doug

From goetz.lindenmaier at sap.com  Thu Sep 22 10:06:09 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 22 Sep 2016 10:06:09 +0000
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
In-Reply-To: <OF8FF66E87.7F5BDE86-ON49258034.005D75B1-49258034.005DAFF6@notes.na.collabserv.com>
References: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
	<fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
	<OF069946E4.2F8AAC40-ON49258032.006041F6-49258032.0060B791@LocalDomain>
	<OF8FF66E87.7F5BDE86-ON49258034.005D75B1-49258034.005DAFF6@notes.na.collabserv.com>
Message-ID: <919b03528ad546a8996e39ed7a737ebb@DEWDFE13DE50.global.corp.sap>

Hi Hiroshi,

I had a look at your change. While I can't tell whether the algorithm is correct,
I can state all our tests are green.  Spilling the registers to stack is fine as there
is the ShadowZone on the stack.

I would appreciate it if it were ported to big endian, too, as a follow-up,
though.

Reviewed.

Best regards,
  Goetz


> -----Original Message-----
> From: Hiroshi H Horii [mailto:HORII at jp.ibm.com]
> Sent: Dienstag, 20. September 2016 19:03
> To: Doerr, Martin <martin.doerr at sap.com>
> Cc: Gustavo Bueno Romero <gromero at br.ibm.com>; hotspot-compiler-
> dev at openjdk.java.net; Volker Simonis (volker.simonis at gmail.com)
> <volker.simonis at gmail.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>
> Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> Hi all,
> 
> Martin thankfully created a webrev with some good correction.
> http://cr.openjdk.java.net/~mdoerr/8164920_ppc_crc32/webrev.01/
> <http://cr.openjdk.java.net/~mdoerr/8164920_ppc_crc32/webrev.01/>
> 
> Could someone review this change again?
> 
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
> 
> 
> 
> 
> From:        Hiroshi H Horii/Japan/IBM
> To:        "Doerr, Martin" <martin.doerr at sap.com>
> Cc:        Gustavo Bueno Romero <gromero at br.ibm.com>, "hotspot-compiler-
> dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>,
> "Volker Simonis (volker.simonis at gmail.com)" <volker.simonis at gmail.com>
> Date:        09/19/2016 02:36
> Subject:        RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> ________________________________
> 
> 
> 
> Hi Martin, and all
> (Please allow me to send this mail twice. The first mail is awaiting because it
> exceeded 100KB)
> 
> Thank you for your reviewing. Gustavo and I recreated a new change based
> on your comments. I would like to request a review again.
> 
> My account of cr server is not available now (because of my mistake...) and
> Gustavo cannot create a webrev file with another reason. I would like to
> attach a diff file created with "hg diff -g" in hotspot.  If possible, could
> someone create a webrev file with this changeset?
> 
> [attachment "hotspot.crc32.20160918.changeset" deleted by Hiroshi H
> Horii/Japan/IBM]
> 
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
> 
> "Doerr, Martin" <martin.doerr at sap.com> wrote on 09/13/2016 18:35:09:
> 
> > From: "Doerr, Martin" <martin.doerr at sap.com>
> > To: Hiroshi H Horii/Japan/IBM at IBMJP, "hotspot-compiler-
> > dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
> > Cc: "Volker Simonis (volker.simonis at gmail.com)"
> > <volker.simonis at gmail.com>, Gustavo Bueno Romero
> <gromero at br.ibm.com>
> > Date: 09/13/2016 18:36
> > Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> >
> > Hi Hiroshi,
> >
> > we appreciate your change. Thanks for contributing it.
> > It basically looks good, but I'd like to propose some minor improvements.
> >
> >
> > kernel_crc32_1word_vpmsumd:
> >
> > 1. The Pre-align code can be implemented shorter:
> >     clrldi_(prealign, buf, 57);
> >     beq(CCR0, L_alignHead);
> >
> >     subfic(prealign, prealign, 128);
> >
> > 2. I'd prefer the label name "L_alignedHead".
> >
> > 3. The branch b(L_alignTail) and the label are not needed and should
> > get removed.
> >
> >
> > kernel_crc32_1word_aligned:
> >
> > 1. When saving and restoring non-volatile vector register, please
> > use offset differences of -16 instead of -32.
> > (The ABI allows up to 288 bytes to be used in frameless functions so
> > it will fit if -16 is used.)
> >
> > 2. The std instructions should better be used with int offsets so
> > you can get rid of the addi(offset, offset, -8) instructions.
> >
> >
> > Comments:
> > For single line comments "//" should be used instead of "/*". Would
> > be nice if you could change them.
> >
> >
> > Thanks and best regards,
> > Martin
> >
> >
> > From: Hiroshi H Horii [mailto:HORII at jp.ibm.com
> <mailto:HORII at jp.ibm.com> ]
> > Sent: Dienstag, 6. September 2016 16:50
> > To: hotspot-compiler-dev at openjdk.java.net; vladimir.kozlov at oracle.com
> > Cc: Volker Simonis (volker.simonis at gmail.com)
> > <volker.simonis at gmail.com>; Doerr, Martin <martin.doerr at sap.com>;
> > Gustavo Bueno Romero <gromero at br.ibm.com>
> > Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> >
> > Dear Vladimir and all:
> >
> > Can I please request reviews for the following change?
> >
> > JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
> <https://bugs.openjdk.java.net/browse/JDK-8164920>
> > webrev: http://cr.openjdk.java.net/~gromero/8164920/01/
> <http://cr.openjdk.java.net/~gromero/8164920/01/>
> >
> > As Volker's comments in the above JIRA, this is a ppc64-only
> > improvement which will not
> > affect any of the Oracle platforms in any way.
> >
> > This change includes new implementation of CRC32 Intrinsics for ppc64le.
> > In my local experiment, CRC32 of 64KB was calculated more than 20
> > times faster than original.
> > Performance of CRC32 Intrinsic is important to run recent Apache
> Cassandra.
> > A Cassandra daemon needs to read 64KB data from a disk with CRC32
> > checksum by default.
> >
> > This JIRA entry has "jdk9-fc-request" label.
> > If there is a chance to include new change in JDK 9 for ppc64le, I
> > would like to request
> > a review for this change.
> >
> > Regards,
> > Hiroshi
> > -----------------------
> > Hiroshi Horii, Ph.D.
> > IBM Research - Tokyo
> 
> 
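
The shorter pre-align sequence suggested in the quoted review (clrldi_, beq, subfic) can be read as the following arithmetic. This is a sketch in plain C++, not the PPC assembler code itself; prealign_bytes is a hypothetical helper name. Clearing the upper 57 bits of a 64-bit address keeps its low 7 bits, that is, the offset within a 128-byte block; if that offset is zero the buffer is already aligned, otherwise 128 minus the offset is the number of bytes to process before the aligned part starts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes to consume before `buf` reaches a 128-byte boundary.
inline std::size_t prealign_bytes(std::uint64_t buf) {
  std::uint64_t low = buf & 127u;  // clrldi_(prealign, buf, 57): keep low 7 bits
  if (low == 0) {
    return 0;                      // beq(CCR0, ...): already aligned
  }
  return static_cast<std::size_t>(128u - low);  // subfic(prealign, prealign, 128)
}
```

For example, an address ending in 0x01 needs 127 leading bytes handled separately, and an address ending in 0x80 needs none.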


From martin.doerr at sap.com  Thu Sep 22 10:16:00 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 22 Sep 2016 10:16:00 +0000
Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
In-Reply-To: <919b03528ad546a8996e39ed7a737ebb@DEWDFE13DE50.global.corp.sap>
References: <OF3DBB7B4B.62D23BD5-ON49258026.0031B271-49258026.0051807D@notes.na.collabserv.com>
	<fdca33bc870f47e89768ef6cd96aba7f@DEWDFE13DE14.global.corp.sap>
	<OF069946E4.2F8AAC40-ON49258032.006041F6-49258032.0060B791@LocalDomain>
	<OF8FF66E87.7F5BDE86-ON49258034.005D75B1-49258034.005DAFF6@notes.na.collabserv.com>
	<919b03528ad546a8996e39ed7a737ebb@DEWDFE13DE50.global.corp.sap>
Message-ID: <de8417a1c8c04a3abee17e7d4c2a027d@DEWDFE13DE14.global.corp.sap>

Hi all,

thanks a lot for the contribution and for reviewing.

8164920 has the label jdk9-fc-yes and a second review, and the tests have passed. I will push it.

Best regards,
Martin

-----Original Message-----
From: Lindenmaier, Goetz 
Sent: Donnerstag, 22. September 2016 12:06
To: Hiroshi H Horii <HORII at jp.ibm.com>; Doerr, Martin <martin.doerr at sap.com>
Cc: Gustavo Bueno Romero <gromero at br.ibm.com>; hotspot-compiler-dev at openjdk.java.net; Volker Simonis (volker.simonis at gmail.com) <volker.simonis at gmail.com>
Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic

Hi Hiroshi,

I had a look at your change. While I can't tell whether the algorithm is correct,
I can state all our tests are green.  Spilling the registers to stack is fine as there
is the ShadowZone on the stack.

I would appreciate it if it were ported to big endian, too, as a follow-up,
though.

Reviewed.

Best regards,
  Goetz


> -----Original Message-----
> From: Hiroshi H Horii [mailto:HORII at jp.ibm.com]
> Sent: Dienstag, 20. September 2016 19:03
> To: Doerr, Martin <martin.doerr at sap.com>
> Cc: Gustavo Bueno Romero <gromero at br.ibm.com>; hotspot-compiler-
> dev at openjdk.java.net; Volker Simonis (volker.simonis at gmail.com)
> <volker.simonis at gmail.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>
> Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> Hi all,
> 
> Martin thankfully created a webrev with some good correction.
> http://cr.openjdk.java.net/~mdoerr/8164920_ppc_crc32/webrev.01/
> <http://cr.openjdk.java.net/~mdoerr/8164920_ppc_crc32/webrev.01/>
> 
> Could someone review this change again?
> 
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
> 
> 
> 
> 
> From:        Hiroshi H Horii/Japan/IBM
> To:        "Doerr, Martin" <martin.doerr at sap.com>
> Cc:        Gustavo Bueno Romero <gromero at br.ibm.com>, "hotspot-compiler-
> dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>,
> "Volker Simonis (volker.simonis at gmail.com)" <volker.simonis at gmail.com>
> Date:        09/19/2016 02:36
> Subject:        RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> 
> ________________________________
> 
> 
> 
> Hi Martin, and all
> (Please allow me to send this mail twice. The first mail is awaiting because it
> exceeded 100KB)
> 
> Thank you for your reviewing. Gustavo and I recreated a new change based
> on your comments. I would like to request a review again.
> 
> My account of cr server is not available now (because of my mistake...) and
> Gustavo cannot create a webrev file with another reason. I would like to
> attach a diff file created with "hg diff -g" in hotspot.  If possible, could
> someone create a webrev file with this changeset?
> 
> [attachment "hotspot.crc32.20160918.changeset" deleted by Hiroshi H
> Horii/Japan/IBM]
> 
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
> 
> "Doerr, Martin" <martin.doerr at sap.com> wrote on 09/13/2016 18:35:09:
> 
> > From: "Doerr, Martin" <martin.doerr at sap.com>
> > To: Hiroshi H Horii/Japan/IBM at IBMJP, "hotspot-compiler-
> > dev at openjdk.java.net" <hotspot-compiler-dev at openjdk.java.net>
> > Cc: "Volker Simonis (volker.simonis at gmail.com)"
> > <volker.simonis at gmail.com>, Gustavo Bueno Romero
> <gromero at br.ibm.com>
> > Date: 09/13/2016 18:36
> > Subject: RE: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> >
> > Hi Hiroshi,
> >
> > we appreciate your change. Thanks for contributing it.
> > It basically looks good, but I'd like to propose some minor improvements.
> >
> >
> > kernel_crc32_1word_vpmsumd:
> >
> > 1. The Pre-align code can be implemented shorter:
> >     clrldi_(prealign, buf, 57);
> >     beq(CCR0, L_alignHead);
> >
> >     subfic(prealign, prealign, 128);
> >
> > 2. I'd prefer the label name "L_alignedHead".
> >
> > 3. The branch b(L_alignTail) and the label are not needed and should
> > get removed.
> >
> >
> > kernel_crc32_1word_aligned:
> >
> > 1. When saving and restoring non-volatile vector register, please
> > use offset differences of -16 instead of -32.
> > (The ABI allows up to 288 bytes to be used in frameless functions so
> > it will fit if -16 is used.)
> >
> > 2. The std instructions should better be used with int offsets so
> > you can get rid of the addi(offset, offset, -8) instructions.
> >
> >
> > Comments:
> > For single line comments "//" should be used instead of "/*". Would
> > be nice if you could change them.
> >
> >
> > Thanks and best regards,
> > Martin
> >
> >
> > From: Hiroshi H Horii [mailto:HORII at jp.ibm.com
> <mailto:HORII at jp.ibm.com> ]
> > Sent: Dienstag, 6. September 2016 16:50
> > To: hotspot-compiler-dev at openjdk.java.net; vladimir.kozlov at oracle.com
> > Cc: Volker Simonis (volker.simonis at gmail.com)
> > <volker.simonis at gmail.com>; Doerr, Martin <martin.doerr at sap.com>;
> > Gustavo Bueno Romero <gromero at br.ibm.com>
> > Subject: RFR(m) 8164920: ppc: enhancement of CRC32 intrinsic
> >
> > Dear Vladimir and all:
> >
> > Can I please request reviews for the following change?
> >
> > JIRA: https://bugs.openjdk.java.net/browse/JDK-8164920
> <https://bugs.openjdk.java.net/browse/JDK-8164920>
> > webrev: http://cr.openjdk.java.net/~gromero/8164920/01/
> <http://cr.openjdk.java.net/~gromero/8164920/01/>
> >
> > As Volker's comments in the above JIRA, this is a ppc64-only
> > improvement which will not
> > affect any of the Oracle platforms in any way.
> >
> > This change includes new implementation of CRC32 Intrinsics for ppc64le.
> > In my local experiment, CRC32 of 64KB was calculated more than 20
> > times faster than original.
> > Performance of CRC32 Intrinsic is important to run recent Apache
> Cassandra.
> > A Cassandra daemon needs to read 64KB data from a disk with CRC32
> > checksum by default.
> >
> > This JIRA entry has "jdk9-fc-request" label.
> > If there is a chance to include new change in JDK 9 for ppc64le, I
> > would like to request
> > a review for this change.
> >
> > Regards,
> > Hiroshi
> > -----------------------
> > Hiroshi Horii, Ph.D.
> > IBM Research - Tokyo
> 
> 
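
When the correctness of an optimized CRC32 intrinsic is hard to judge by inspection, one option is to compare its results against a slow reference. The sketch below is such a reference, not the vpmsumd-based algorithm from the change: it is the plain bit-at-a-time CRC-32 with the reflected polynomial 0xEDB88320, which is the checksum java.util.zip.CRC32 computes. The function name crc32_reference is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise reference CRC-32 (reflected polynomial 0xEDB88320).
// Slow, but small enough to verify by reading.
std::uint32_t crc32_reference(const unsigned char* data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;  // initial value
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      // mask is all ones if the lowest bit is set, otherwise zero
      std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return ~crc;  // final inversion
}
```

A test harness could feed the same buffers (including unaligned starts and lengths around the 128-byte boundary) to both this function and the intrinsic and compare the results.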


From jamsheed.c.m at oracle.com  Thu Sep 22 16:22:18 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Thu, 22 Sep 2016 21:52:18 +0530
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <3d9039ee-84a2-d4e8-e36a-08f12c4cd504@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
	<05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
	<7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
	<3d9039ee-84a2-d4e8-e36a-08f12c4cd504@oracle.com>
Message-ID: <087568ef-70b1-2938-eb27-784264d0ec39@oracle.com>

Hi Vladimir,

Thanks for the review,


On 9/19/2016 10:08 PM, Vladimir Ivanov wrote:
> Overall, the fix looks good.
>
> Some nitpicks:
>  (1) I'd prefer to avoid using ciMethod::is_compiled_lambda_form();
OK. I hope there are no correctness reasons behind this?
>  (2) align with other uses of TypeCast for method handles.
There was a bug in the closed ARM port blocking this (8166441). I was getting
a failure in the closed arm-32 port. It is fixed and is out for review.
The issue was with a null constant getting typecast. I chose an alternate
implementation to avoid all those cases. I am OK with aligning with the
previous typecast implementation.

I hope I needn't send an updated webrev?

Best Regards,
Jamsheed

>
> Also, ciType::is_klass() can be replaced with 
> !ciType::is_primitive_type() check, but IMO it doesn't matter much.
>
> Something like the following:
>
> diff --git a/src/share/vm/c1/c1_GraphBuilder.cpp 
> b/src/share/vm/c1/c1_GraphBuilder.cpp
> --- a/src/share/vm/c1/c1_GraphBuilder.cpp
> +++ b/src/share/vm/c1/c1_GraphBuilder.cpp
> @@ -1493,6 +1493,24 @@
>    // Check to see whether we are inlining. If so, Return
>    // instructions become Gotos to the continuation point.
>    if (continuation() != NULL) {
> +
> +    int invoke_bci = state()->caller_state()->bci();
> +
> +    if (x != NULL && !ignore_return) {
> +      ciMethod* caller = state()->scope()->caller()->method();
> +      Bytecodes::Code invoke_raw_bc = 
> caller->raw_code_at_bci(invoke_bci);
> +      if (invoke_raw_bc == Bytecodes::_invokehandle ||
> +          invoke_raw_bc == Bytecodes::_invokedynamic) {
> +        ciType* declared_ret_type = 
> caller->get_declared_signature_at_bci(invoke_bci)->return_type();
> +        if (declared_ret_type->is_klass() &&
> +            x->exact_type() == NULL &&
> +            x->declared_type() != declared_ret_type &&
> +            declared_ret_type != compilation()->env()->Object_klass()) {
> +          x = append(new TypeCast(declared_ret_type->as_klass(), x, 
> copy_state_before()));
> +        }
> +      }
> +    }
> +
>      assert(!method()->is_synchronized() || InlineSynchronizedMethods, 
> "can not inline synchronized methods yet");
>
>      if (compilation()->env()->dtrace_method_probes()) {
> @@ -1516,7 +1534,6 @@
>      // State at end of inlined method is the state of the caller
>      // without the method parameters on stack, including the
>      // return value, if any, of the inlined method on operand stack.
> -    int invoke_bci = state()->caller_state()->bci();
>      set_state(state()->caller_state()->copy_for_parsing());
>      if (x != NULL) {
>        if (!ignore_return) {
> diff --git a/src/share/vm/c1/c1_Instruction.cpp 
> b/src/share/vm/c1/c1_Instruction.cpp
> --- a/src/share/vm/c1/c1_Instruction.cpp
> +++ b/src/share/vm/c1/c1_Instruction.cpp
> @@ -360,7 +360,8 @@
>  }
>
>  ciType* Invoke::declared_type() const {
> -  ciType *t = _target->signature()->return_type();
> +  ciSignature* declared_signature = 
> state()->scope()->method()->get_declared_signature_at_bci(state()->bci());
> +  ciType *t = declared_signature->return_type();
>    assert(t->basic_type() != T_VOID, "need return value of void 
> method?");
>    return t;
>  }
> diff --git a/src/share/vm/ci/ciMethod.hpp b/src/share/vm/ci/ciMethod.hpp
> --- a/src/share/vm/ci/ciMethod.hpp
> +++ b/src/share/vm/ci/ciMethod.hpp
> @@ -255,6 +255,12 @@
>      ciSignature* ignored_declared_signature;
>      return get_method_at_bci(bci, ignored_will_link, 
> &ignored_declared_signature);
>    }
> +  ciSignature*  get_declared_signature_at_bci(int bci) {
> +    bool ignored_will_link;
> +    ciSignature* declared_signature;
> +    get_method_at_bci(bci, ignored_will_link, &declared_signature);
> +    return declared_signature;
> +  }
>
>    // Given a certain calling environment, find the monomorphic target
>    // for the call.  Return NULL if the call is not monomorphic in
>
> Best regards,
> Vladimir Ivanov
>
> On 9/11/16 2:51 PM, Jamsheed C m wrote:
>> i made some changes to my fix. webrev is updated in place.
>>
>> pit results with latest modification updated in bug(not still completed)
>>
>> Best Regards,
>>
>> Jamsheed
>>
>>
>> On 9/10/2016 3:53 AM, Jamsheed C m wrote:
>>>
>>> adding a little more description as per my understanding
>>>
>>> This issue can happen only for compiled lforms not inlined case
>>>
>>> there are two scenarios.
>>> 1) no compiled lforms inlined
>>> 2) some compiled lforms are inlined or  final method is not inlined
>>> (linkTo* not inlined)..    (i.e partially inlined)
>>>
>>> in all these cases *Invoke instruction* will be *return Value*. and
>>> will have erased type.
>>> so we reify return type either by type casting(for partially inlined
>>> case) or by directly pulling from callsite MT.
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>>
>>> On 9/8/2016 3:26 PM, Jamsheed C m wrote:
>>>> Hi All,
>>>>
>>>> bugid: https://bugs.openjdk.java.net/browse/JDK-8134389
>>>>
>>>> webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/
>>>>
>>>> return type information is not available in lforms, this causes
>>>> contradictions in operation like  store indexed. mh _linkTo* site arg
>>>> type casting. etc..
>>>>
>>>> fix: TypeCast to declared return type at lform return.
>>>>
>>>> Best Regards,
>>>>
>>>> Jamsheed
>>>>
>>>
>>


From igor.ignatyev at oracle.com  Thu Sep 22 17:56:13 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 22 Sep 2016 20:56:13 +0300
Subject: RFR(XXS) : 8166549 : fix incorrectly @ignore-d hotspot/compiler tests
Message-ID: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
> 1 line changed: 0 ins; 0 del; 1 mod;

Hi all,

Could you please review this tiny patch, which corrects the bug id used in @ignore?
compiler/codecache/stress/OverloadCompileQueueTest.java was @ignored due to JDK-8071905[1], but JDK-8071905 is closed as a dup of JDK-8079586[2], which is fixed. The test still has a problem (it can time out), so I have filed a new bug[3] and used its id in @ignore.

JBS: https://bugs.openjdk.java.net/browse/JDK-8166549
webrev: http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/

[1] https://bugs.openjdk.java.net/browse/JDK-8071905
[2] https://bugs.openjdk.java.net/browse/JDK-8079586
[3] https://bugs.openjdk.java.net/browse/JDK-8166554

Thanks,
-- Igor

From kirill.zhaldybin at oracle.com  Thu Sep 22 18:08:11 2016
From: kirill.zhaldybin at oracle.com (Kirill Zhaldybin)
Date: Thu, 22 Sep 2016 21:08:11 +0300
Subject: RFR(XXS) : 8166549 : fix incorrectly @ignore-d hotspot/compiler
	tests
In-Reply-To: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>
References: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>
Message-ID: <e559e873-c372-021b-0eab-ef006427caf3@oracle.com>

Igor,

Looks good to me.

Regards, Kirill

On 22.09.2016 20:56, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>> 1 line changed: 0 ins; 0 del; 1 mod;
> Hi all,
>
> could you please review this tiny patch which corrects bug id used in @ignore?
> compiler/codecache/stress/OverloadCompileQueueTest.java was @ignored due to JDK-8071905[1], but JDK-8071905 is closed as a dup of JDK-8079586[2] which is fixed. the test still has a problem (it can timeout), so I have filed a new bug[3] and used its id in @ignore.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8166549
> webrev: http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8071905
> [2] https://bugs.openjdk.java.net/browse/JDK-8079586
> [3] https://bugs.openjdk.java.net/browse/JDK-8166554
>
> Thanks,
> -- Igor


From vladimir.x.ivanov at oracle.com  Thu Sep 22 18:16:56 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 22 Sep 2016 21:16:56 +0300
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <087568ef-70b1-2938-eb27-784264d0ec39@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
	<05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
	<7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
	<3d9039ee-84a2-d4e8-e36a-08f12c4cd504@oracle.com>
	<087568ef-70b1-2938-eb27-784264d0ec39@oracle.com>
Message-ID: <59650400-0347-12c7-1a26-4efb99d5b29e@oracle.com>

>>  (1) I'd prefer to avoid using ciMethod::is_compiled_lambda_form();
> Ok. Hope there is no correctness reasons behind this?

No, it simply doesn't communicate the intention clearly enough.

All LambdaForms are marked w/ @Compiled, but we are only interested in 
invokers (indy, exact & generic invokers).
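
To illustrate the intent, here is a rough, self-contained model of the condition used in the diff quoted below: invokers are selected by the bytecode at the call site (invokehandle / invokedynamic) rather than by a per-method @Compiled flag. The types Bytecode, Type and Value and the function needs_return_type_cast() are hypothetical stand-ins for illustration only; they are not the real ci* / C1 classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real HotSpot ci*/C1 types.
enum class Bytecode { invokestatic, invokevirtual, invokehandle, invokedynamic };

struct Type {
  bool is_primitive;      // true for int, long, ...
  bool is_object_klass;   // true only for java.lang.Object
};

struct Value {
  const Type* exact_type;     // nullptr when no exact type is known
  const Type* declared_type;  // possibly erased type seen by the compiler
};

// Returns true when the value returned from an inlined invoker should be
// cast to the return type declared at the call site.
bool needs_return_type_cast(Bytecode invoke_bc,
                            const Value* x,
                            bool ignore_return,
                            const Type* declared_ret_type) {
  if (x == nullptr || ignore_return) {
    return false;  // nothing is returned, or the caller drops the result
  }
  // Select invokers by the call-site bytecode instead of a method-level flag.
  if (invoke_bc != Bytecode::invokehandle &&
      invoke_bc != Bytecode::invokedynamic) {
    return false;
  }
  assert(declared_ret_type != nullptr);
  return !declared_ret_type->is_primitive &&       // only reference types
         x->exact_type == nullptr &&               // exact type not already known
         x->declared_type != declared_ret_type &&  // type really was erased
         !declared_ret_type->is_object_klass;      // a cast to Object adds nothing
}
```

The point of the sketch is only that the decision depends on how the call site invokes the target, which states the intention more directly than asking whether the callee is a compiled LambdaForm.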

>>  (2) align with other uses of TypeCast for method handles.
> There was a bug in closed arm port blocking this(8166441). i was getting
> failure in arm-32 closed port. its fixed and is out on review.
> The issue was with null constant getting typecast. i chose alternate
> implementation to avoid all those cases. i am Ok for aligning with
> previous typecast implementation.

Good.

> hope i needn't send  updated webrev?

No need to send new webrev.

Best regards,
Vladimir Ivanov

>
> Best Regards,
> Jamsheed
>
>>
>> Also, ciType::is_klass() can be replaced with
>> !ciType::is_primitive_type() check, but IMO it doesn't matter much.
>>
>> Something like the following:
>>
>> diff --git a/src/share/vm/c1/c1_GraphBuilder.cpp
>> b/src/share/vm/c1/c1_GraphBuilder.cpp
>> --- a/src/share/vm/c1/c1_GraphBuilder.cpp
>> +++ b/src/share/vm/c1/c1_GraphBuilder.cpp
>> @@ -1493,6 +1493,24 @@
>>    // Check to see whether we are inlining. If so, Return
>>    // instructions become Gotos to the continuation point.
>>    if (continuation() != NULL) {
>> +
>> +    int invoke_bci = state()->caller_state()->bci();
>> +
>> +    if (x != NULL && !ignore_return) {
>> +      ciMethod* caller = state()->scope()->caller()->method();
>> +      Bytecodes::Code invoke_raw_bc =
>> caller->raw_code_at_bci(invoke_bci);
>> +      if (invoke_raw_bc == Bytecodes::_invokehandle ||
>> +          invoke_raw_bc == Bytecodes::_invokedynamic) {
>> +        ciType* declared_ret_type =
>> caller->get_declared_signature_at_bci(invoke_bci)->return_type();
>> +        if (declared_ret_type->is_klass() &&
>> +            x->exact_type() == NULL &&
>> +            x->declared_type() != declared_ret_type &&
>> +            declared_ret_type != compilation()->env()->Object_klass()) {
>> +          x = append(new TypeCast(declared_ret_type->as_klass(), x,
>> copy_state_before()));
>> +        }
>> +      }
>> +    }
>> +
>>      assert(!method()->is_synchronized() || InlineSynchronizedMethods,
>> "can not inline synchronized methods yet");
>>
>>      if (compilation()->env()->dtrace_method_probes()) {
>> @@ -1516,7 +1534,6 @@
>>      // State at end of inlined method is the state of the caller
>>      // without the method parameters on stack, including the
>>      // return value, if any, of the inlined method on operand stack.
>> -    int invoke_bci = state()->caller_state()->bci();
>>      set_state(state()->caller_state()->copy_for_parsing());
>>      if (x != NULL) {
>>        if (!ignore_return) {
>> diff --git a/src/share/vm/c1/c1_Instruction.cpp
>> b/src/share/vm/c1/c1_Instruction.cpp
>> --- a/src/share/vm/c1/c1_Instruction.cpp
>> +++ b/src/share/vm/c1/c1_Instruction.cpp
>> @@ -360,7 +360,8 @@
>>  }
>>
>>  ciType* Invoke::declared_type() const {
>> -  ciType *t = _target->signature()->return_type();
>> +  ciSignature* declared_signature =
>> state()->scope()->method()->get_declared_signature_at_bci(state()->bci());
>>
>> +  ciType *t = declared_signature->return_type();
>>    assert(t->basic_type() != T_VOID, "need return value of void
>> method?");
>>    return t;
>>  }
>> diff --git a/src/share/vm/ci/ciMethod.hpp b/src/share/vm/ci/ciMethod.hpp
>> --- a/src/share/vm/ci/ciMethod.hpp
>> +++ b/src/share/vm/ci/ciMethod.hpp
>> @@ -255,6 +255,12 @@
>>      ciSignature* ignored_declared_signature;
>>      return get_method_at_bci(bci, ignored_will_link,
>> &ignored_declared_signature);
>>    }
>> +  ciSignature*  get_declared_signature_at_bci(int bci) {
>> +    bool ignored_will_link;
>> +    ciSignature* declared_signature;
>> +    get_method_at_bci(bci, ignored_will_link, &declared_signature);
>> +    return declared_signature;
>> +  }
>>
>>    // Given a certain calling environment, find the monomorphic target
>>    // for the call.  Return NULL if the call is not monomorphic in
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 9/11/16 2:51 PM, Jamsheed C m wrote:
>>> i made some changes to my fix. webrev is updated in place.
>>>
>>> pit results with latest modification updated in bug(not still completed)
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>>
>>> On 9/10/2016 3:53 AM, Jamsheed C m wrote:
>>>>
>>>> adding a little more description as per my understanding
>>>>
>>>> This issue can happen only for compiled lforms not inlined case
>>>>
>>>> there are two scenarios.
>>>> 1) no compiled lforms inlined
>>>> 2) some compiled lforms are inlined or  final method is not inlined
>>>> (linkTo* not inlined)..    (i.e partially inlined)
>>>>
>>>> in all these cases *Invoke instruction* will be *return Value*. and
>>>> will have erased type.
>>>> so we reify return type either by type casting(for partially inlined
>>>> case) or by directly pulling from callsite MT.
>>>>
>>>> Best Regards,
>>>>
>>>> Jamsheed
>>>>
>>>>
>>>> On 9/8/2016 3:26 PM, Jamsheed C m wrote:
>>>>> Hi All,
>>>>>
>>>>> bugid: https://bugs.openjdk.java.net/browse/JDK-8134389
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~jcm/8134389/webrev.00/
>>>>>
>>>>> return type information is not available in lforms, this causes
>>>>> contradictions in operation like  store indexed. mh _linkTo* site arg
>>>>> type casting. etc..
>>>>>
>>>>> fix: TypeCast to declared return type at lform return.
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>
>>>
>

From vladimir.kozlov at oracle.com  Thu Sep 22 18:32:23 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 22 Sep 2016 11:32:23 -0700
Subject: RFR(XXS) : 8166549 : fix incorrectly @ignore-d hotspot/compiler
	tests
In-Reply-To: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>
References: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>
Message-ID: <57E423B7.5080107@oracle.com>

Good.

Thanks,
Vladimir

On 9/22/16 10:56 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>> 1 line changed: 0 ins; 0 del; 1 mod;
>
> Hi all,
>
> could you please review this tiny patch which corrects bug id used in @ignore?
> compiler/codecache/stress/OverloadCompileQueueTest.java was @ignored due to JDK-8071905[1], but JDK-8071905 is closed as a dup of JDK-8079586[2] which is fixed. the test still has a problem (it can timeout), so I have filed a new bug[3] and used its id in @ignore.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8166549
> webrev: http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8071905
> [2] https://bugs.openjdk.java.net/browse/JDK-8079586
> [3] https://bugs.openjdk.java.net/browse/JDK-8166554
>
> Thanks,
> -- Igor
>

From igor.ignatyev at oracle.com  Thu Sep 22 19:05:34 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 22 Sep 2016 22:05:34 +0300
Subject: RFR(XXS) : 8166549 : fix incorrectly @ignore-d hotspot/compiler
	tests
In-Reply-To: <e559e873-c372-021b-0eab-ef006427caf3@oracle.com>
References: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>
	<e559e873-c372-021b-0eab-ef006427caf3@oracle.com>
Message-ID: <975B41AA-2B5E-43FD-AB97-9D3C3ABE2169@oracle.com>

Kirill,

thank you.

-- Igor
> On Sep 22, 2016, at 9:08 PM, Kirill Zhaldybin <kirill.zhaldybin at oracle.com> wrote:
> 
> Igor,
> 
> Looks good to me.
> 
> Regards, Kirill
> 
> On 22.09.2016 20:56, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>>> 1 line changed: 0 ins; 0 del; 1 mod;
>> Hi all,
>> 
>> could you please review this tiny patch which corrects bug id used in @ignore?
>> compiler/codecache/stress/OverloadCompileQueueTest.java was @ignored due to JDK-8071905[1], but JDK-8071905 is closed as a dup of JDK-8079586[2] which is fixed. the test still has a problem (it can timeout), so I have filed a new bug[3] and used its id in @ignore.
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8166549
>> webrev: http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>> 
>> [1] https://bugs.openjdk.java.net/browse/JDK-8071905
>> [2] https://bugs.openjdk.java.net/browse/JDK-8079586
>> [3] https://bugs.openjdk.java.net/browse/JDK-8166554
>> 
>> Thanks,
>> -- Igor
> 


From igor.ignatyev at oracle.com  Thu Sep 22 19:05:50 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 22 Sep 2016 22:05:50 +0300
Subject: RFR(XXS) : 8166549 : fix incorrectly @ignore-d hotspot/compiler
	tests
In-Reply-To: <57E423B7.5080107@oracle.com>
References: <AD4DD936-8DA6-48B2-9994-1189203D007B@oracle.com>
	<57E423B7.5080107@oracle.com>
Message-ID: <57321D41-26E8-4713-8923-1BEC64C63E99@oracle.com>

Vladimir,

Thank you for the review,

-- Igor
> On Sep 22, 2016, at 9:32 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir
> 
> On 9/22/16 10:56 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>>> 1 line changed: 0 ins; 0 del; 1 mod;
>> 
>> Hi all,
>> 
>> could you please review this tiny patch which corrects bug id used in @ignore?
>> compiler/codecache/stress/OverloadCompileQueueTest.java was @ignored due to JDK-8071905[1], but JDK-8071905 is closed as a dup of JDK-8079586[2] which is fixed. the test still has a problem (it can timeout), so I have filed a new bug[3] and used its id in @ignore.
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8166549
>> webrev: http://cr.openjdk.java.net/~iignatyev/8166549/webrev.00/
>> 
>> [1] https://bugs.openjdk.java.net/browse/JDK-8071905
>> [2] https://bugs.openjdk.java.net/browse/JDK-8079586
>> [3] https://bugs.openjdk.java.net/browse/JDK-8166554
>> 
>> Thanks,
>> -- Igor
>> 


From jamsheed.c.m at oracle.com  Fri Sep 23 07:13:25 2016
From: jamsheed.c.m at oracle.com (Jamsheed C m)
Date: Fri, 23 Sep 2016 12:43:25 +0530
Subject: RFR: 8134389: Crash in HotSpot with jvm.dll+0x42b48
	ciObjectFactory::create_new_metadata
In-Reply-To: <59650400-0347-12c7-1a26-4efb99d5b29e@oracle.com>
References: <bad02f78-e54b-b61e-72db-f435c02c5e10@oracle.com>
	<05c82c51-9525-eec7-206e-a265c7d47194@oracle.com>
	<7c1a8b01-b4ec-ea23-b59a-500c1bfd5dbc@oracle.com>
	<3d9039ee-84a2-d4e8-e36a-08f12c4cd504@oracle.com>
	<087568ef-70b1-2938-eb27-784264d0ec39@oracle.com>
	<59650400-0347-12c7-1a26-4efb99d5b29e@oracle.com>
Message-ID: <0ab782c5-036a-9374-e39f-5980011849dd@oracle.com>

Thanks for the clarification, Vladimir Ivanov.

Best Regards,

Jamsheed


On 9/22/2016 11:46 PM, Vladimir Ivanov wrote:
>> Ok. Hope there is no correctness reasons behind this?
>
> No, it simply doesn't communicate the intention clearly enough.
>
> All LambdaForms are marked w/ @Compiled, but we are only interested in 
> invokers (indy, exact & generic invokers).


From goetz.lindenmaier at sap.com  Fri Sep 23 10:58:37 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 23 Sep 2016 10:58:37 +0000
Subject: RFR(M): 8166562: C2: Suppress relocations in scratch emit.
Message-ID: <e90008ba6ba64b8fad5fd7203659b58e@DEWDFE13DE50.global.corp.sap>

Hi,

Please review this nice and small improvement to scratch emit. It simplifies
the s390 port considerably, but is completely independent of it. I introduced
usage of the feature on ppc. I need a sponsor, please.

http://cr.openjdk.java.net/~goetz/wr16/8166562-scratch_emit/webrev.01/

The C2 compiler needs to know how much space the assembly emitted for a MachNode requires. For many nodes, this is statically specified. Some nodes don't have fixed sizes, as the code emitted depends on flags or even runtime values. To determine the sizes of these, C2 does a scratch emit, i.e., it emits the assembly for the MachNode to a dedicated code buffer and remembers the space needed. In the debug build, this is done on each emit also for nodes with fixed size to verify the fixed size.

The scratch emit buffer does not support relocations. Therefore any code needing relocations must check for scratch emit and skip the relocations if so.

The s390x architecture offers a lot of instructions with pc-relative addressing. We use these to access constants in the constant section of the code buffer. As this section can be resized, these offsets must be able to be relocated. Instead of coding the check whether a scratch emit is happening into all the MachNodes, we mark the scratch emit buffers as such and just skip the relocation in these buffers. This simplifies usage of relocations in a lot of nodes and macroAssembler routines.
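
As a rough illustration of the idea (not the code in the webrev), the sketch below marks a buffer as a scratch buffer and lets the buffer itself drop relocations, so the emit routine needs no scratch-emit check of its own. MiniCodeBuffer, emit_load_const() and scratch_emit_size() are hypothetical names made up for this example, and the emitted words are placeholders rather than real instruction encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical miniature code buffer; it does not mirror HotSpot's CodeBuffer API.
class MiniCodeBuffer {
 public:
  explicit MiniCodeBuffer(bool is_scratch) : _is_scratch(is_scratch) {}

  void emit_int32(uint32_t insn) { _code.push_back(insn); }

  // Record a relocation at the current position. A scratch buffer only
  // measures code size, so the relocation is silently skipped there.
  void relocate(int reloc_type) {
    if (_is_scratch) return;
    _relocs.push_back({size_in_bytes(), reloc_type});
  }

  std::size_t size_in_bytes() const { return _code.size() * sizeof(uint32_t); }
  std::size_t reloc_count() const { return _relocs.size(); }

 private:
  struct Reloc { std::size_t offset; int type; };
  bool _is_scratch;
  std::vector<uint32_t> _code;
  std::vector<Reloc> _relocs;
};

// Stand-in for a node's emit routine. Note that it contains no check for
// scratch emit: the buffer decides whether the relocation is kept.
void emit_load_const(MiniCodeBuffer& cb) {
  cb.relocate(/*reloc_type=*/1);  // pc-relative reference into the constant section
  cb.emit_int32(0xC0000000u);     // placeholder words, not real instructions
  cb.emit_int32(0x00000000u);
}

// Scratch emit: emit into a throwaway buffer and report the space needed.
std::size_t scratch_emit_size(void (*emit)(MiniCodeBuffer&)) {
  MiniCodeBuffer scratch(/*is_scratch=*/true);
  emit(scratch);
  assert(scratch.reloc_count() == 0);  // relocations were suppressed
  return scratch.size_in_bytes();
}
```

Calling scratch_emit_size(emit_load_const) measures the node without recording a relocation, while emitting the same routine into a non-scratch buffer records it as usual.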

Best regards,
  Goetz.

From doug.simon at oracle.com  Fri Sep 23 12:31:05 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 23 Sep 2016 14:31:05 +0200
Subject: RFR: 8166517: [JVMCI] export JVMCI to auto-detected JVMCI compiler
In-Reply-To: <C4C7CCED-4AF8-4C95-8846-40284551320A@oracle.com>
References: <C4C7CCED-4AF8-4C95-8846-40284551320A@oracle.com>
Message-ID: <17D28FE7-8E15-4D0C-89DE-7178E75A339D@oracle.com>

Can I please get a review of this tiny change.

Thanks!

-Doug

> On 22 Sep 2016, at 09:54, Doug Simon <doug.simon at oracle.com> wrote:
> 
> When JVMCI compiler auto-selection (JDK-8160730) is used, JVMCI needs to be exported to the selected compiler in the same way as if the -Djvmci.Compiler property was specified.
> 
> https://bugs.openjdk.java.net/browse/JDK-8166517
> http://cr.openjdk.java.net/~dnsimon/8166517/
> 
> -Doug


From vladimir.kozlov at oracle.com  Fri Sep 23 19:09:24 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 23 Sep 2016 12:09:24 -0700
Subject: RFR: 8166517: [JVMCI] export JVMCI to auto-detected JVMCI compiler
In-Reply-To: <17D28FE7-8E15-4D0C-89DE-7178E75A339D@oracle.com>
References: <C4C7CCED-4AF8-4C95-8846-40284551320A@oracle.com>
	<17D28FE7-8E15-4D0C-89DE-7178E75A339D@oracle.com>
Message-ID: <57E57DE4.3000305@oracle.com>

Looks fine to me. But I thought Chris or Tom could review it. They are both OpenJDK Reviewers.

Thanks,
Vladimir

On 9/23/16 5:31 AM, Doug Simon wrote:
> Can I please get a review of this tiny change.
>
> Thanks!
>
> -Doug
>
>> On 22 Sep 2016, at 09:54, Doug Simon <doug.simon at oracle.com> wrote:
>>
>> When JVMCI compiler auto-selection (JDK-8160730) is used, JVMCI needs to be exported to the selected compiler in the same way as if the -Djvmci.Compiler property was specified.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8166517
>> http://cr.openjdk.java.net/~dnsimon/8166517/
>>
>> -Doug
>

From doug.simon at oracle.com  Fri Sep 23 20:19:00 2016
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 23 Sep 2016 22:19:00 +0200
Subject: RFR: 8166517: [JVMCI] export JVMCI to auto-detected JVMCI compiler
In-Reply-To: <57E57DE4.3000305@oracle.com>
References: <C4C7CCED-4AF8-4C95-8846-40284551320A@oracle.com>
	<17D28FE7-8E15-4D0C-89DE-7178E75A339D@oracle.com>
	<57E57DE4.3000305@oracle.com>
Message-ID: <720062BC-70D6-4592-AA39-9662398BBBC1@oracle.com>


> On 23 Sep 2016, at 21:09, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Looks fine to me. But I thought Chris or Tom could review it. They are both OpenJDK Reviewers.

I'm not fussy - anyone with the sufficient role will do ;-)

Thanks for the review in any case.

-Doug

> On 9/23/16 5:31 AM, Doug Simon wrote:
>> Can I please get a review of this tiny change.
>> 
>> Thanks!
>> 
>> -Doug
>> 
>>> On 22 Sep 2016, at 09:54, Doug Simon <doug.simon at oracle.com> wrote:
>>> 
>>> When JVMCI compiler auto-selection (JDK-8160730) is used, JVMCI needs to be exported to the selected compiler in the same way as if the -Djvmci.Compiler property was specified.
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8166517
>>> http://cr.openjdk.java.net/~dnsimon/8166517/
>>> 
>>> -Doug
>> 


From kishor.kharbas at intel.com  Fri Sep 23 20:32:31 2016
From: kishor.kharbas at intel.com (Kharbas, Kishor)
Date: Fri, 23 Sep 2016 20:32:31 +0000
Subject: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered
	by a JNI call on windows
In-Reply-To: <57E30EF5.8010709@oracle.com>
References: <F89640DCD01A85489FCBA68183A6A0F38EB8A9A7@ORSMSX116.amr.corp.intel.com>
	<57BE1AD4.7070403@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB8AC2E@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB91338@ORSMSX116.amr.corp.intel.com>
	<6aee0e7c-76a5-a920-7099-a3edc349f205@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB92A1C@ORSMSX116.amr.corp.intel.com>
	<4af19c5d-9a7f-d18b-820b-6f3664b8183a@oracle.com>
	<7de8489c-943b-5ecf-48c1-0bffad101070@oracle.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB957C3@ORSMSX116.amr.corp.intel.com>
	<F89640DCD01A85489FCBA68183A6A0F38EB96FE4@ORSMSX116.amr.corp.intel.com>
	<57E30EF5.8010709@oracle.com>
Message-ID: <F89640DCD01A85489FCBA68183A6A0F38EBB5CE7@ORSMSX116.amr.corp.intel.com>

Ah OK, glad to know it wasn't a regression caused by this patch.

Thanks
Kishor

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Wednesday, September 21, 2016 3:52 PM
To: Kharbas, Kishor <kishor.kharbas at intel.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get clobbered by a JNI call on windows

To close the loop on this.

It looks like the machine on which the test failed had a well-known XMM saving problem in the Linux kernel.
So we decided to push the changes. You may have seen the notification already.

regards,
Vladimir

On 9/8/16 5:46 PM, Kharbas, Kishor wrote:
> Hi Vladimir,
> I couldn't reproduce the error on my 32-bit Linux machine. The test 
> was done on a Sandy bridge machine (has AVX instruction set) Please advise how to proceed further.
>
> Thanks
> Kishor
>
>
> -----Original Message-----
> From: Kharbas, Kishor
> Sent: Tuesday, September 6, 2016 5:40 PM
> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; 
> hotspot-compiler-dev at openjdk.java.net
> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get 
> clobbered by a JNI call on windows
>
> Hi Vladimir,
>
> The patch only touches code in _WIN64. I am having hard time to 
> understand why the test fails for 32-bit Linux
>
> Btw, that test passes on Windows 64 platform. I am planning to test on Linux too.
>
> Thanks
> Kishor
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Tuesday, September 6, 2016 2:31 PM
> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get 
> clobbered by a JNI call on windows
>
> Next jtreg test failed on 32-bit Linux:
>
> hotspot/test/compiler/runtime/Test7196199.java
>
> ----------System.err:(57/2416)----------
> test_incrc: [41] = 8.081506E20 != 150000.0
> test_incrc: [42] = 1.8632992E31 != 150000.0
> test_incrc: [43] = 2.8397877E29 != 150000.0 ...
>
> https://bugs.openjdk.java.net/browse/JDK-7196199
>
> was related to Upper bits (64-255) of XMM (YMM) registers are not saved/restored in interrupt handle code during safepoint.
>
> Looks like your changes are not enough.
>
> Vladimir
>
>
> On 9/6/16 10:12 AM, Vladimir Kozlov wrote:
>> Good. I start testing these changes. I will push it if testing pass.
>>
>> Thanks,
>> Vladimir
>>
>> On 9/2/16 3:07 PM, Kharbas, Kishor wrote:
>>> Thanks Vladimir,
>>>
>>> I have updated the patch :
>>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.02/
>>>
>>> I looked for other places in src/cpu/x86/vm. I feel every case is 
>>> covered.
>>>
>>> - Kishor
>>>
>>> -----Original Message-----
>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>> Sent: Thursday, September 1, 2016 11:39 AM
>>> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
>>> hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get 
>>> clobbered by a JNI call on windows
>>>
>>> Good. But looks like some code relied on old stack layout in stubs, 
>>> for example sha256_AVX2():
>>>
>>> #ifndef _WIN64
>>>     _XMM_SAVE_SIZE = 0,
>>> #else
>>>     _XMM_SAVE_SIZE = 8*16,
>>> #endif
>>>
>>> Please, check that all other related code is fixed too. (I looked on 
>>> all cases of _WIN64 in src/cpu/x86/vm/).
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 8/31/16 10:17 PM, Kharbas, Kishor wrote:
>>>> Hello,
>>>>
>>>> I removed the unwanted save and restore of registers in the range
>>>> XMM6-XMM31 from the x64_64 stubs.
>>>> I also removed the #ifdef _WIN64 block from x86.ad file.
>>>>
>>>> Link to the new patch :
>>>> http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.01/
>>>>
>>>> Thanks
>>>> Kishor
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Kharbas, Kishor
>>>> Sent: Wednesday, August 24, 2016 6:24 PM
>>>> To: Vladimir Kozlov <vladimir.kozlov at oracle.com>; 
>>>> hotspot-compiler-dev at openjdk.java.net
>>>> Cc: Kharbas, Kishor <kishor.kharbas at intel.com>
>>>> Subject: RE: RFR(M) 8078122 : YMM registers upper 128 bits may get 
>>>> clobbered by a JNI call on windows
>>>>
>>>> Thanks Vladimir for quick feedback.
>>>> I will look into the stubs which save the registers in the range 
>>>> XMM6-XMM31. Also the first comment makes perfect sense.
>>>>
>>>> Thanks
>>>> Kishor
>>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Wednesday, August 24, 2016 3:08 PM
>>>> To: Kharbas, Kishor <kishor.kharbas at intel.com>; 
>>>> hotspot-compiler-dev at openjdk.java.net
>>>> Subject: Re: RFR(M) 8078122 : YMM registers upper 128 bits may get 
>>>> clobbered by a JNI call on windows
>>>>
>>>> Hi Kishor,
>>>>
>>>> First, #ifdef _WIN64 is not needed anymore since calling convention 
>>>> is similar to unix now.
>>>>
>>>> Second, I would like you to look more broadly. With this change we 
>>>> don't need to preserve XMM6-XMM31 in our stubs for WIN64. I am not 
>>>> sure that we can remove all #ifdef _WIN64 there but for most of 
>>>> them I think we can do. Please, look.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/24/16 2:40 PM, Kharbas, Kishor wrote:
>>>>> Requesting the community to review the patch for
>>>>> https://bugs.openjdk.java.net/browse/JDK-8078122
>>>>>
>>>>> Webrev : http://cr.openjdk.java.net/~vdeshpande/8078122/webrev.00
>>>>>
>>>>> The patch changes the definitions of registers XMM6-XMM31 for WIN64.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Kishor
>>>>>

From vladimir.kozlov at oracle.com  Fri Sep 23 21:25:52 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 23 Sep 2016 14:25:52 -0700
Subject: RFR(M): 8166562: C2: Suppress relocations in scratch emit.
In-Reply-To: <e90008ba6ba64b8fad5fd7203659b58e@DEWDFE13DE50.global.corp.sap>
References: <e90008ba6ba64b8fad5fd7203659b58e@DEWDFE13DE50.global.corp.sap>
Message-ID: <57E59DE0.8020200@oracle.com>

Looks good. I thought about using a new type of CodeBlobType, but it may need more changes than yours.

Thanks,
Vladimir

On 9/23/16 3:58 AM, Lindenmaier, Goetz wrote:
> Hi,
>
> Please review this nice and small improvement to scratch emit. It simplifies
>
> The s390 port considerably, but is completely independent. I introduced
>
> usage of the feature on ppc.   I please need a sponsor.
>
> http://cr.openjdk.java.net/~goetz/wr16/8166562-scratch_emit/webrev.01/
>
> The C2 compiler needs to know how much space the assembly emitted for a MachNode requires. For many nodes, this is statically specified. Some nodes don't have fixed sizes, as the code emitted depends
> on flags or even runtime values. To determine the sizes of these, C2 does a scratch emit, i.e., it emits the assembly for the MachNode to a dedicated code buffer and remembers the space needed. In the
> debug build, this is done on each emit also for nodes with fixed size to verify the fixed size.
>
> The scratch emit buffer does not support relocations. Therefore any code needing relocations must check for scratch emit and skip the relocations if so.
>
> The s390x architecture offers a lot of instructions with pc-relative addressing. We use these to access constants in the constant section of the code buffer. As this section can be resized, these
> offsets must be able to be relocated. Instead of coding the check whether a scratch emit is happening into all the MachNodes, we mark the scratch emit buffers as such and just skip the relocation in
> these buffers. This simplifies usage of relocations in a lot of nodes and macroAssembler routines.
>
> Best regards,
>
>    Goetz.
>

From martin.doerr at sap.com  Mon Sep 26 08:46:02 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 26 Sep 2016 08:46:02 +0000
Subject: RFR(S): 8166689: PPC64: Race condition between stack bang and
	non-entrant patching
Message-ID: <1e90645ee3e14fee8ef0b7dc131157a2@DEWDFE13DE14.global.corp.sap>

Hi,

I found a race condition between stack bang and non-entrant patching on linux PPC64.

The signal handler on linux PPC64 investigates the instruction when a stack bang has hit the protected zone. Another thread may patch the verified entry point preventing the signal handler from recognizing the stack overflow.
This problem can be prevented by rearranging C1's prolog code such that the stack bang instruction will never be at the verified entry point. C2's prolog code is already implemented accordingly.
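
The interleaving can be modelled in a few lines. This is a simplified, single-threaded sketch under assumptions: the instruction kinds, the Method struct and all function names are hypothetical, a real handler decodes PPC64 machine words, and the instruction placed in front of the stack bang in the "fixed" layout is just a generic placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds; a real handler inspects PPC64 machine words.
enum class Insn { StackBang, Other, BranchToHandler };

struct Method {
  std::vector<Insn> code;
  std::size_t verified_entry;  // index of the verified entry point
};

// Model of the signal handler: it recognizes a stack overflow only if the
// instruction at the faulting pc still looks like a stack bang.
bool handler_recognizes_stack_overflow(const Method& m, std::size_t fault_pc) {
  return m.code[fault_pc] == Insn::StackBang;
}

// Making a method non-entrant overwrites the instruction at the verified entry.
void make_not_entrant(Method& m) {
  m.code[m.verified_entry] = Insn::BranchToHandler;
}

// Replays the problematic interleaving on a copy of the method:
//   1. thread A faults while executing the stack bang at fault_pc,
//   2. thread B patches the verified entry point,
//   3. the signal handler for thread A inspects the code at fault_pc.
bool overflow_recognized_despite_patch(Method m, std::size_t fault_pc) {
  assert(m.code[fault_pc] == Insn::StackBang);
  make_not_entrant(m);
  return handler_recognizes_stack_overflow(m, fault_pc);
}
```

With the bang at the verified entry, e.g. Method{{Insn::StackBang, Insn::Other}, 0} faulting at pc 0, the handler misses the overflow after the patch; with another instruction first, e.g. Method{{Insn::Other, Insn::StackBang}, 0} faulting at pc 1, the patch cannot touch the bang and the overflow is still recognized.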

My proposed fix is here:
http://cr.openjdk.java.net/~mdoerr/8166689_PPC64_C1_stackbang/webrev.00/

I have also fixed a missing RewriteControl check in the template interpreter on PPC64.

Please review.

Thanks and best regards,
Martin


From goetz.lindenmaier at sap.com  Mon Sep 26 09:16:21 2016
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 26 Sep 2016 09:16:21 +0000
Subject: RFR(S): 8166689: PPC64: Race condition between stack bang and
	non-entrant patching
In-Reply-To: <1e90645ee3e14fee8ef0b7dc131157a2@DEWDFE13DE14.global.corp.sap>
References: <1e90645ee3e14fee8ef0b7dc131157a2@DEWDFE13DE14.global.corp.sap>
Message-ID: <59f6c7b138df480680b2f3e47d80c986@DEWDFE13DE50.global.corp.sap>

Hi Martin,

Good catch!
I wondered whether CodeOffsets::Frame_Complete is still set properly,
but that's set in shared code after calling build_frame. So the fix is good.

Thanks also for fixing the better byte behavior issue.

Best regards,
  Goetz.

From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Doerr, Martin
Sent: Montag, 26. September 2016 10:46
To: hotspot-compiler-dev at openjdk.java.net
Subject: RFR(S): 8166689: PPC64: Race condition between stack bang and non-entrant patching

Hi,

I found a race condition between stack bang and non-entrant patching on linux PPC64.

The signal handler on linux PPC64 investigates the instruction when a stack bang has hit the protected zone. Another thread may patch the verified entry point preventing the signal handler from recognizing the stack overflow.
This problem can be prevented by rearranging C1's prolog code such that the stack bang instruction will never be at the verified entry point. C2's prolog code is already implemented accordingly.

My proposed fix is here:
http://cr.openjdk.java.net/~mdoerr/8166689_PPC64_C1_stackbang/webrev.00/

I have also fixed a missing RewriteControl check in the template interpreter on PPC64.

Please review.

Thanks and best regards,
Martin


From martin.doerr at sap.com  Mon Sep 26 09:27:25 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 26 Sep 2016 09:27:25 +0000
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
In-Reply-To: <a11870a3-1de9-2d5b-0498-39eb7c0357df@oracle.com>
References: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
	<66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>
	<73f98e3882bd46dab427a02de68a1b93@DEWDFE13DE14.global.corp.sap>
	<a11870a3-1de9-2d5b-0498-39eb7c0357df@oracle.com>
Message-ID: <14cca179436e4d49ae94f44977af033d@DEWDFE13DE14.global.corp.sap>

Hi,

can somebody sponsor this C1 bug fix, please?
It already has one review.

Thanks and best regards,
Martin


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Montag, 19. September 2016 19:10
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms

This looks good.

Thanks,
Vladimir

On 9/19/16 6:47 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> you're right. I have fixed that too in the new webrev:
> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.01/
>
> The 2 LIR_Address constructors you have mentioned don't have many users. The other ones look ok.
>
> Thanks and best regards,
> Martin
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Donnerstag, 15. September 2016 20:06
> To: hotspot-compiler-dev at openjdk.java.net
> Cc: Doerr, Martin <martin.doerr at sap.com>
> Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms
>
> Good, but it is not enough.
>
> emit_array_address() in c1_LIRGenerator_x86.cpp has the same problem.
> I would suggest looking at all places where the following methods are called and making sure they are correct:
>
> LIR_Address(LIR_Opr base, intx disp, BasicType type)
> LIR_Address(LIR_Opr base, LIR_Opr index, Scale scale, intx disp, BasicType type)
>
> Thanks,
> Vladimir
>
> On 9/15/16 8:25 AM, Doerr, Martin wrote:
>> Hi,
>>
>>
>>
>> as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However,
>> int may overflow on 64 bit platforms.
>>
>>
>>
>> Please review the following webrev:
>>
>> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/
>>
>>
>>
>> I'll also need a sponsor, please.
>>
>>
>>
>> Thanks and best regards,
>>
>> Martin
>>
>>
>>

From vladimir.kozlov at oracle.com  Mon Sep 26 16:15:27 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 26 Sep 2016 09:15:27 -0700
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
In-Reply-To: <14cca179436e4d49ae94f44977af033d@DEWDFE13DE14.global.corp.sap>
References: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
	<66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>
	<73f98e3882bd46dab427a02de68a1b93@DEWDFE13DE14.global.corp.sap>
	<a11870a3-1de9-2d5b-0498-39eb7c0357df@oracle.com>
	<14cca179436e4d49ae94f44977af033d@DEWDFE13DE14.global.corp.sap>
Message-ID: <57E9499F.3010908@oracle.com>

Sent to JPRT.

Vladimir

On 9/26/16 2:27 AM, Doerr, Martin wrote:
> Hi,
>
> can somebody sponsor this C1 bug fix, please?
> It has already one review.
>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Montag, 19. September 2016 19:10
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms
>
> This looks good.
>
> Thanks,
> Vladimir
>
> On 9/19/16 6:47 AM, Doerr, Martin wrote:
>> Hi Vladimir,
>>
>> you're right. I have fixed that too in the new webrev:
>> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.01/
>>
>> The 2 LIR_Address constructors you have mentioned don't have many users. The other ones look ok.
>>
>> Thanks and best regards,
>> Martin
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Donnerstag, 15. September 2016 20:06
>> To: hotspot-compiler-dev at openjdk.java.net
>> Cc: Doerr, Martin <martin.doerr at sap.com>
>> Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms
>>
>> Good, but it is not enough.
>>
>> emit_array_address() in c1_LIRGenerator_x86.cpp has the same problem.
>> I would suggest looking at all places where the following methods are called and making sure they are correct:
>>
>> LIR_Address(LIR_Opr base, intx disp, BasicType type)
>> LIR_Address(LIR_Opr base, LIR_Opr index, Scale scale, intx disp, BasicType type)
>>
>> Thanks,
>> Vladimir
>>
>> On 9/15/16 8:25 AM, Doerr, Martin wrote:
>>> Hi,
>>>
>>>
>>>
>>> as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However,
>>> int may overflow on 64 bit platforms.
>>>
>>>
>>>
>>> Please review the following webrev:
>>>
>>> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/
>>>
>>>
>>>
>>> I'll also need a sponsor, please.
>>>
>>>
>>>
>>> Thanks and best regards,
>>>
>>> Martin
>>>
>>>
>>>

From vitalyd at gmail.com  Mon Sep 26 17:23:01 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Mon, 26 Sep 2016 13:23:01 -0400
Subject: Odd inlining failure
Message-ID: <CAHjP37HtiaJpJyDF4BLbwKobvgs2q2=U3f7Lv+WpOPf3UTm=ng@mail.gmail.com>

Hi guys,

I'm trying to understand some "odd" inlining output from PrintInlining -
hoping someone can explain/confirm.

I have the following call graph:
a()
------> b()
--------------> c()

So a() calls b() (and some other methods that aren't relevant here).  b()
calls c() and d() internally.  a() gets hot, and is queued up for
compilation (C2, tiered is disabled).

b() is large (> MaxInlineSize) but less than FreqInlineSize - it gets
inlined with "inline (hot)" in the log.  c() is similar -- it's large, but
< FreqInlineSize.  However, the inlining output says "too big", and c()
isn't inlined.  Now, c() is *always* called when b() is called - it's a
helper method (ironically, contains code moved out of b() to make b()
smaller).  b() is also the only caller of c().

So, if b() is "hot", why is c() not? Is it because compilation, and
therefore inlining, started top-down here? CompileThreshold is the default
here - 10000.  Is it the case that b() reaches 10k, but c() is at 9999
still and is therefore not inlined?

Let me know if something's not clear in the above description.

Thanks

From cthalinger at twitter.com  Mon Sep 26 20:36:51 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Mon, 26 Sep 2016 10:36:51 -1000
Subject: RFR: 8166517: [JVMCI] export JVMCI to auto-detected JVMCI compiler
In-Reply-To: <720062BC-70D6-4592-AA39-9662398BBBC1@oracle.com>
References: <C4C7CCED-4AF8-4C95-8846-40284551320A@oracle.com>
	<17D28FE7-8E15-4D0C-89DE-7178E75A339D@oracle.com>
	<57E57DE4.3000305@oracle.com>
	<720062BC-70D6-4592-AA39-9662398BBBC1@oracle.com>
Message-ID: <77F1D693-5893-48EB-9A77-DF8CBC092F0D@twitter.com>


> On Sep 23, 2016, at 10:19 AM, Doug Simon <doug.simon at oracle.com> wrote:
> 
> 
>> On 23 Sep 2016, at 21:09, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>> 
>> Looks fine to me. But I thought Chris or Tom could review it. They are both OpenJDK Reviewers.

Sorry, I was traveling.  Looks good, FTR.

> 
> I'm not fussy - anyone with the sufficient role will do ;-)
> 
> Thanks for the review in any case.
> 
> -Doug
> 
>> On 9/23/16 5:31 AM, Doug Simon wrote:
>>> Can I please get a review of this tiny change.
>>> 
>>> Thanks!
>>> 
>>> -Doug
>>> 
>>>> On 22 Sep 2016, at 09:54, Doug Simon <doug.simon at oracle.com> wrote:
>>>> 
>>>> When JVMCI compiler auto-selection (JDK-8160730) is used, JVMCI needs to be exported to the selected compiler in the same way as if the -Djvmci.Compiler property was specified.
>>>> 
>>>> https://bugs.openjdk.java.net/browse/JDK-8166517
>>>> http://cr.openjdk.java.net/~dnsimon/8166517/
>>>> 
>>>> -Doug
>>> 
> 


From rasbold at google.com  Mon Sep 26 22:18:31 2016
From: rasbold at google.com (Chuck Rasbold)
Date: Mon, 26 Sep 2016 15:18:31 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
Message-ID: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>

A small fix for an edge case crash in C2...

Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/

Requesting a sponsor and reviews.  Thanks.

-- Chuck

From vladimir.kozlov at oracle.com  Mon Sep 26 22:45:23 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 26 Sep 2016 15:45:23 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
Message-ID: <57E9A503.6090506@oracle.com>

Hi Chuck

Can you do 'long' arithmetic in existing condition to catch integer overflow instead?

if ((ratio_con * stride_con) == stride_con2) { // Check for exact

thanks,
Vladimir


On 9/26/16 3:18 PM, Chuck Rasbold wrote:
> A small fix for an edge case crash in C2...
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
> Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>
> Requesting a sponsor and reviews.  Thanks.
>
> -- Chuck

From rasbold at google.com  Tue Sep 27 00:01:14 2016
From: rasbold at google.com (Chuck Rasbold)
Date: Mon, 26 Sep 2016 17:01:14 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <57E9A503.6090506@oracle.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
Message-ID: <CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>

Just to confirm, are you suggesting that the ratio be first computed as a
64 bit quantity, effectively along the lines of...

  long ratio_conl = ((long) stride_con2) / stride_con;

  if ((ratio_conl * stride_con) == stride_con2 &&
      ratio_conl < 0x8000000 ) { // Check for exact
    int ratio_con = (int) ratio_conl;


On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com
> wrote:

> Hi Chuck
>
> Can you do 'long' arithmetic in existing condition to catch integer
> overflow instead?
>
> if ((ratio_con * stride_con) == stride_con2) { // Check for exact
>
> thanks,
> Vladimir
>
>
>
> On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>
>> A small fix for an edge case crash in C2...
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
>> Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>>
>> Requesting a sponsor and reviews.  Thanks.
>>
>> -- Chuck
>>
>

From vladimir.kozlov at oracle.com  Tue Sep 27 00:35:17 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 26 Sep 2016 17:35:17 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
	<CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
Message-ID: <57E9BEC5.2060308@oracle.com>

Slightly different (cast after /) and jlong type:

   jlong ratio_conl = (jlong) (stride_con2 / stride_con);

   if ((ratio_conl * stride_con) == (jlong)stride_con2) { // Check for exact

Vladimir

On 9/26/16 5:01 PM, Chuck Rasbold wrote:
> Just to confirm, are you suggesting that the ratio be first computed as a 64 bit quantity, effectively along the lines of...
>
>    long ratio_conl = ((long) stride_con2) / stride_con;
>
>    if ((ratio_conl * stride_con) == stride_con2 &&
>        ratio_conl < 0x8000000 ) { // Check for exact
>      int ratio_con = (int) ratio_conl;
>
>
> On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>
>     Hi Chuck
>
>     Can you do 'long' arithmetic in existing condition to catch integer overflow instead?
>
>     if ((ratio_con * stride_con) == stride_con2) { // Check for exact
>
>     thanks,
>     Vladimir
>
>
>
>     On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>
>         A small fix for an edge case crash in C2...
>
>         Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
>         Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>
>         Requesting a sponsor and reviews.  Thanks.
>
>         -- Chuck
>
>

From martin.doerr at sap.com  Tue Sep 27 08:19:01 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 27 Sep 2016 08:19:01 +0000
Subject: RFR(M): 8166140: C1: Possible integer overflow in
	LIRGenerator::generate_address on several platforms
In-Reply-To: <57E9499F.3010908@oracle.com>
References: <29e2b45c984248da8172cf921b7811a6@DEWDFE13DE14.global.corp.sap>
	<66073428-8ee1-ecf2-52c0-8f4af5a6e6e8@oracle.com>
	<73f98e3882bd46dab427a02de68a1b93@DEWDFE13DE14.global.corp.sap>
	<a11870a3-1de9-2d5b-0498-39eb7c0357df@oracle.com>
	<14cca179436e4d49ae94f44977af033d@DEWDFE13DE14.global.corp.sap>
	<57E9499F.3010908@oracle.com>
Message-ID: <563ea24df5e9458092264113e8989bbf@DEWDFE13DE14.global.corp.sap>

Hi Vladimir,

thank you very much for your support.

Best regards,
Martin

-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Montag, 26. September 2016 18:15
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms

Sent to JPRT.

Vladimir

On 9/26/16 2:27 AM, Doerr, Martin wrote:
> Hi,
>
> can somebody sponsor this C1 bug fix, please?
> It has already one review.
>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Montag, 19. September 2016 19:10
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms
>
> This looks good.
>
> Thanks,
> Vladimir
>
> On 9/19/16 6:47 AM, Doerr, Martin wrote:
>> Hi Vladimir,
>>
>> you're right. I have fixed that too in the new webrev:
>> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.01/
>>
>> The 2 LIR_Address constructors you have mentioned don't have many users. The other ones look ok.
>>
>> Thanks and best regards,
>> Martin
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Donnerstag, 15. September 2016 20:06
>> To: hotspot-compiler-dev at openjdk.java.net
>> Cc: Doerr, Martin <martin.doerr at sap.com>
>> Subject: Re: RFR(M): 8166140: C1: Possible integer overflow in LIRGenerator::generate_address on several platforms
>>
>> Good, but it is not enough.
>>
>> emit_array_address() in c1_LIRGenerator_x86.cpp has the same problem.
>> I would suggest looking at all places where the following methods are called and making sure they are correct:
>>
>> LIR_Address(LIR_Opr base, intx disp, BasicType type)
>> LIR_Address(LIR_Opr base, LIR_Opr index, Scale scale, intx disp, BasicType type)
>>
>> Thanks,
>> Vladimir
>>
>> On 9/15/16 8:25 AM, Doerr, Martin wrote:
>>> Hi,
>>>
>>>
>>>
>>> as discussed with Vladimir, C1 contains code to simplify constant index/displacement addressing which uses int. However,
>>> int may overflow on 64 bit platforms.
>>>
>>>
>>>
>>> Please review the following webrev:
>>>
>>> http://cr.openjdk.java.net/~mdoerr/8166140_C1_int_overflow/webrev.00/
>>>
>>>
>>>
>>> I'll also need a sponsor, please.
>>>
>>>
>>>
>>> Thanks and best regards,
>>>
>>> Martin
>>>
>>>
>>>

From martin.doerr at sap.com  Tue Sep 27 09:03:12 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 27 Sep 2016 09:03:12 +0000
Subject: RFR(XXS): 8166767: C2: OptimizeStringConcat produces wrong results
	when copying UTF16 Strings
Message-ID: <6359416b37c7459594386d16d48d0644@DEWDFE13DE14.global.corp.sap>

Hi,

can somebody review and sponsor this very small C2 bug fix, please?
PhaseStringOpts::copy_constant_string increments index twice in the copy loop when source and destination are UTF16 encoded.

http://cr.openjdk.java.net/~mdoerr/8166767_StringOpts_copy_bug/webrev.00/

Thanks and best regards,
Martin


From tobias.hartmann at oracle.com  Tue Sep 27 09:30:41 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 27 Sep 2016 11:30:41 +0200
Subject: RFR(XXS): 8166767: C2: OptimizeStringConcat produces wrong
	results when copying UTF16 Strings
In-Reply-To: <6359416b37c7459594386d16d48d0644@DEWDFE13DE14.global.corp.sap>
References: <6359416b37c7459594386d16d48d0644@DEWDFE13DE14.global.corp.sap>
Message-ID: <57EA3C41.5050206@oracle.com>

Hi Martin,

On 27.09.2016 11:03, Doerr, Martin wrote:
> PhaseStringOpts::copy_constant_string increments index twice in the copy loop when source and destination are UTF16 encoded.
> http://cr.openjdk.java.net/~mdoerr/8166767_StringOpts_copy_bug/webrev.00/

The index passed to readChar refers to an index in the source byte array (and length is the size in bytes):
 // Read two bytes from index and index+1 and convert them to a char

For example, if we want to read the second char value, we need to use index = 4. Therefore, if we read chars, we need to increment i twice in each loop iteration to get the correct char index in the byte array.
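
To make the indexing concrete, here is a minimal standalone sketch in plain C++. It is not the HotSpot code (PhaseStringOpts builds IR nodes rather than reading bytes directly); read_char and copy_utf16 are hypothetical names, and little-endian byte order is assumed purely for illustration. The point it shows is that the loop variable is a byte index, so char k lives at bytes 2*k and 2*k+1 and the index advances by two per char.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for readChar: combine the bytes at index and
// index+1 into one 16-bit char (little-endian order assumed here).
static uint16_t read_char(const std::vector<uint8_t>& src, size_t index) {
  assert(index + 1 < src.size());
  return static_cast<uint16_t>(src[index] | (src[index + 1] << 8));
}

// Decode every char of a UTF16 byte array. 'i' is a byte index, so it
// advances by two for each char that is read.
static std::vector<uint16_t> copy_utf16(const std::vector<uint8_t>& src) {
  assert(src.size() % 2 == 0);
  std::vector<uint16_t> dst;
  for (size_t i = 0; i < src.size(); i += 2) {
    dst.push_back(read_char(src, i));
  }
  return dst;
}
```

Advancing i by only one per iteration would read overlapping byte pairs and produce twice as many (wrong) chars.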

Or am I missing something?

Best regards,
Tobias

From martin.doerr at sap.com  Tue Sep 27 09:40:59 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 27 Sep 2016 09:40:59 +0000
Subject: RFR(XXS): 8166767: C2: OptimizeStringConcat produces wrong
	results when copying UTF16 Strings
In-Reply-To: <57EA3C41.5050206@oracle.com>
References: <6359416b37c7459594386d16d48d0644@DEWDFE13DE14.global.corp.sap>
	<57EA3C41.5050206@oracle.com>
Message-ID: <0d0271854c0d4792ac0b2cc0180a83d5@DEWDFE13DE14.global.corp.sap>

Hi Tobias,

thank you very much for reviewing. You're right.
I just noticed that the code didn't fit with another change which I have locally. I have to fix that.
Sorry for the mistake. I'll close the bug.

Best regards,
Martin


-----Original Message-----
From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] 
Sent: Dienstag, 27. September 2016 11:31
To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(XXS): 8166767: C2: OptimizeStringConcat produces wrong results when copying UTF16 Strings

Hi Martin,

On 27.09.2016 11:03, Doerr, Martin wrote:
> PhaseStringOpts::copy_constant_string increments index twice in the copy loop when source and destination are UTF16 encoded.
> http://cr.openjdk.java.net/~mdoerr/8166767_StringOpts_copy_bug/webrev.00/

The index passed to readChar refers to an index in the source byte array (and length is the size in bytes):
 // Read two bytes from index and index+1 and convert them to a char

For example, if we want to read the second char value, we need to use index = 4. Therefore, if we read chars, we need to increment i twice in each loop iteration to get the correct char index in the byte array.

Or am I missing something?

Best regards,
Tobias

From tobias.hartmann at oracle.com  Tue Sep 27 10:19:46 2016
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 27 Sep 2016 12:19:46 +0200
Subject: RFR(XXS): 8166767: C2: OptimizeStringConcat produces wrong
	results when copying UTF16 Strings
In-Reply-To: <0d0271854c0d4792ac0b2cc0180a83d5@DEWDFE13DE14.global.corp.sap>
References: <6359416b37c7459594386d16d48d0644@DEWDFE13DE14.global.corp.sap>
	<57EA3C41.5050206@oracle.com>
	<0d0271854c0d4792ac0b2cc0180a83d5@DEWDFE13DE14.global.corp.sap>
Message-ID: <57EA47C2.4000404@oracle.com>

Hi Martin,

On 27.09.2016 11:40, Doerr, Martin wrote:
> Hi Tobias,
> 
> thank you very much for reviewing. You're right.
> I just noticed that the code didn't fit with another change which I have locally. I have to fix that.
> Sorry for the mistake. I'll close the bug.

Sure, no problem!

Best regards,
Tobias

> Best regards,
> Martin
> 
> 
> -----Original Message-----
> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com] 
> Sent: Dienstag, 27. September 2016 11:31
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(XXS): 8166767: C2: OptimizeStringConcat produces wrong results when copying UTF16 Strings
> 
> Hi Martin,
> 
> On 27.09.2016 11:03, Doerr, Martin wrote:
>> PhaseStringOpts::copy_constant_string increments index twice in the copy loop when source and destination are UTF16 encoded.
>> http://cr.openjdk.java.net/~mdoerr/8166767_StringOpts_copy_bug/webrev.00/
> 
> The index passed to readChar refers to an index in the source byte array (and length is the size in bytes):
>  // Read two bytes from index and index+1 and convert them to a char
> 
> For example, if we want to read the second char value, we need to use index = 4. Therefore, if we read chars, we need to increment i twice in each loop iteration to get the correct char index in the byte array.
> 
> Or am I missing something?
> 
> Best regards,
> Tobias
> 

From roland.schatz at oracle.com  Tue Sep 27 12:54:56 2016
From: roland.schatz at oracle.com (Roland Schatz)
Date: Tue, 27 Sep 2016 14:54:56 +0200
Subject: RFR: 8166781: fix wrong comment in ReceiverTypeData
Message-ID: <6e26665d-b687-4ef8-619f-b398eca63b2a@oracle.com>

Hi,

Please review this comment fix:

webrev: http://cr.openjdk.java.net/~rschatz/JDK-8166781/webrev.00/
issue: https://bugs.openjdk.java.net/browse/JDK-8166781

According to my reading of the code, the comment should now agree with 
the code.
But I don't pretend to really understand that code. It would be nice if 
someone who knows about the profiling code could confirm that's actually 
true ;)

See also previous thread about that issue:
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-August/024105.html

Thanks,
Roland


From rasbold at google.com  Tue Sep 27 14:56:27 2016
From: rasbold at google.com (Chuck Rasbold)
Date: Tue, 27 Sep 2016 07:56:27 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <57E9BEC5.2060308@oracle.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
	<CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
	<57E9BEC5.2060308@oracle.com>
Message-ID: <CALFb4KvUEtckd0kp_rKNGzJF7TVK7vnndws2WdEx6d=Pj6WkLw@mail.gmail.com>

On Mon, Sep 26, 2016 at 5:35 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com
> wrote:

> Slightly different (cast after /) and jlong type:
>
>   jlong ratio_conl = (jlong) (stride_con2 / stride_con);
>

The division above won't work (at least, it raises a SIGFPE on my Linux x86
platform) when stride_con2 == min_jint   and stride_con == -1.


>   if ((ratio_conl * stride_con) == (jlong)stride_con2) { // Check for exact
>
>
What would be the value of ratio_conl such that this test fails?  I think
I'm missing something...

-- Chuck


> Vladimir
>
> On 9/26/16 5:01 PM, Chuck Rasbold wrote:
>
>> Just to confirm, are you suggesting that the ratio be first computed as a
>> 64 bit quantity, effectively along the lines of...
>>
>>    long ratio_conl = ((long) stride_con2) / stride_con;
>>
>>    if ((ratio_conl * stride_con) == stride_con2 &&
>>        ratio_conl < 0x8000000 ) { // Check for exact
>>      int ratio_con = (int) ratio_conl;
>>
>>
>> On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov <
>> vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>>
>>     Hi Chuck
>>
>>     Can you do 'long' arithmetic in existing condition to catch integer
>> overflow instead?
>>
>>     if ((ratio_con * stride_con) == stride_con2) { // Check for exact
>>
>>     thanks,
>>     Vladimir
>>
>>
>>
>>     On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>>
>>         A small fix for an edge case crash in C2...
>>
>>         Bug: https://bugs.openjdk.java.net/browse/JDK-8166742 <
>> https://bugs.openjdk.java.net/browse/JDK-8166742>
>>         Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/ <
>> http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/>
>>
>>         Requesting a sponsor and reviews.  Thanks.
>>
>>         -- Chuck
>>
>>
>>

From vladimir.kozlov at oracle.com  Tue Sep 27 16:48:44 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 27 Sep 2016 09:48:44 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <CALFb4KvUEtckd0kp_rKNGzJF7TVK7vnndws2WdEx6d=Pj6WkLw@mail.gmail.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
	<CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
	<57E9BEC5.2060308@oracle.com>
	<CALFb4KvUEtckd0kp_rKNGzJF7TVK7vnndws2WdEx6d=Pj6WkLw@mail.gmail.com>
Message-ID: <274bdc7a-e7e5-882c-5715-32063f1e0f2c@oracle.com>

So why is it a SIGFPE when both values are 'int'?

I thought incorrect results caused the SIGFPE, which is why I suggested 
checking for integer overflow.

Let's go with your second suggested change then. But let's check that the 
ratio is small first and cast to (jint), otherwise the long check is 
useless:

   // The ratio of the two strides cannot be represented as an int
   // if stride_con2 is min_int and stride_con is -1.
   jlong ratio_conl = ((jlong)stride_con2 / stride_con);

   if ((ratio_conl < 0x80000000L) &&
       (jint)(ratio_conl * stride_con) == stride_con2) { // Check for exact
      jint ratio_con = (jint)ratio_conl;
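
For reference, the same idea as a standalone sketch outside HotSpot, using int32_t/int64_t in place of jint/jlong. The helper name exact_stride_ratio is hypothetical, and unlike the snippet above it compares the product in 64 bits instead of casting it back to 32 bits; treat it as an illustration under those assumptions, not as the proposed patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns true and stores the ratio if stride_con2 is an exact multiple of
// stride_con and the quotient fits in 32 bits. The division is done in
// 64 bits, so INT32_MIN / -1 yields 2147483648 instead of trapping.
static bool exact_stride_ratio(int32_t stride_con, int32_t stride_con2,
                               int32_t* ratio_con) {
  assert(stride_con != 0);
  int64_t ratio_conl = static_cast<int64_t>(stride_con2) / stride_con;
  if (ratio_conl > INT32_MAX || ratio_conl < INT32_MIN) {
    return false;  // quotient is not representable as a 32-bit int
  }
  if (ratio_conl * stride_con != stride_con2) {
    return false;  // stride_con does not divide stride_con2 exactly
  }
  *ratio_con = static_cast<int32_t>(ratio_conl);
  return true;
}
```

With stride_con == -1 and stride_con2 == INT32_MIN the range check rejects the pair before any narrowing cast happens.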

Thanks,
Vladimir

On 9/27/16 7:56 AM, Chuck Rasbold wrote:
>
>
> On Mon, Sep 26, 2016 at 5:35 PM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>
>     Slightly different (cast after /) and jlong type:
>
>       jlong ratio_conl = (jlong) (stride_con2 / stride_con);
>
>
> The division above won't work (at least, it raises a SIGFPE on my Linux
> x86 platform) when stride_con2 == min_jint   and stride_con == -1.
>
>
>       if ((ratio_conl * stride_con) == (jlong)stride_con2) { // Check
>     for exact
>
>
> What would be the value of ratio_conl such that this test fails?  I
> think I'm missing something...
>
> -- Chuck
>
>
>     Vladimir
>
>     On 9/26/16 5:01 PM, Chuck Rasbold wrote:
>
>         Just to confirm, are you suggesting that the ratio be first
>         computed as a 64 bit quantity, effectively along the lines of...
>
>            long ratio_conl = ((long) stride_con2) / stride_con;
>
>            if ((ratio_conl * stride_con) == stride_con2 &&
>                ratio_conl < 0x8000000 ) { // Check for exact
>              int ratio_con = (int) ratio_conl;
>
>
>         On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov
>         <vladimir.kozlov at oracle.com> wrote:
>
>             Hi Chuck
>
>             Can you do 'long' arithmetic in existing condition to catch
>         integer overflow instead?
>
>             if ((ratio_con * stride_con) == stride_con2) { // Check for
>         exact
>
>             thanks,
>             Vladimir
>
>
>
>             On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>
>                 A small fix for an edge case crash in C2...
>
>                 Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
>                 Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>
>                 Requesting a sponsor and reviews.  Thanks.
>
>                 -- Chuck
>
>
>

From rasbold at google.com  Tue Sep 27 20:57:35 2016
From: rasbold at google.com (Chuck Rasbold)
Date: Tue, 27 Sep 2016 13:57:35 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <274bdc7a-e7e5-882c-5715-32063f1e0f2c@oracle.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
	<CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
	<57E9BEC5.2060308@oracle.com>
	<CALFb4KvUEtckd0kp_rKNGzJF7TVK7vnndws2WdEx6d=Pj6WkLw@mail.gmail.com>
	<274bdc7a-e7e5-882c-5715-32063f1e0f2c@oracle.com>
Message-ID: <CALFb4Kt1B6TPiCsiGd-R3OTL+6DTZVvEFwgp4cjvq4Q-fHLjJg@mail.gmail.com>

Sorry for not being transparent enough.  Here's an external reference that
describes the problem the division runs into:

https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/Signed-Integer-Division.html

That's why the original fix targeted a very specific case.  One can't
represent ratio_con as a 32 bit value in that case.
Worse, trying to compute it by division causes a SIGFPE.
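
To make the case concrete, here is a minimal sketch. It is an illustration
only, not the patch under review: exact_stride_ratio is a hypothetical helper,
and int32_t/int64_t stand in for HotSpot's jint/jlong. The operand is widened
before the division, so min_jint / -1 is computed in 64 bits and the trapping
32-bit division is never executed; the range check then rejects the one
quotient that does not fit in 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration; not the code from the webrev.
// Returns true and stores the quotient if stride_con2 is an exact multiple
// of stride_con and the quotient fits in 32 bits.
bool exact_stride_ratio(int32_t stride_con2, int32_t stride_con,
                        int32_t* ratio_con) {
  assert(stride_con != 0);  // assumed to be guaranteed by the caller

  // Widen *before* dividing: INT32_MIN / -1 overflows in 32 bits (undefined
  // behaviour in C++, SIGFPE on x86) but is an ordinary 64-bit division.
  int64_t ratio_conl = static_cast<int64_t>(stride_con2) / stride_con;

  // Reject quotients that cannot be represented as a 32-bit int.
  if (ratio_conl > std::numeric_limits<int32_t>::max() ||
      ratio_conl < std::numeric_limits<int32_t>::min()) {
    return false;
  }
  // Check for exact division, still in 64-bit arithmetic.
  if (ratio_conl * stride_con != static_cast<int64_t>(stride_con2)) {
    return false;
  }
  *ratio_con = static_cast<int32_t>(ratio_conl);
  return true;
}
```

With this shape, (min_jint, -1) is simply reported as "no usable ratio"
instead of being special-cased ahead of the division.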

Do you think the revised code below is as straightforward as the original?

-- Chuck

On Tue, Sep 27, 2016 at 9:48 AM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:

> So why is it a SIGFPE when both values are 'int'?
>
> I thought incorrect results caused the SIGFPE, which is why I suggested
> checking for integer overflow.
>
> Let's go with your second suggested change here, then. But let's check that
> the ratio is small first and cast to (jint); otherwise the long check is
> useless:
>
>   // The ratio of the two strides cannot be represented as an int
>   // if stride_con2 is min_int and stride_con is -1.
>   jlong ratio_conl = ((jlong)stride_con2 / stride_con);
>
>   if ((ratio_conl < 0x80000000L) &&
>       (jint)(ratio_conl * stride_con) == stride_con2) { // Check for exact
>      jint ratio_con = (jint)ratio_conl;
>
> Thanks,
> Vladimir
>
> On 9/27/16 7:56 AM, Chuck Rasbold wrote:
>
>>
>>
>> On Mon, Sep 26, 2016 at 5:35 PM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com> wrote:
>>
>>     Slightly different (cast after /) and jlong type:
>>
>>       jlong ratio_conl = (jlong) (stride_con2 / stride_con);
>>
>>
>> The division above won't work (at least, it raises a SIGFPE on my Linux
>> x86 platform) when stride_con2 == min_jint   and stride_con == -1.
>>
>>
>>       if ((ratio_conl * stride_con) == (jlong)stride_con2) { // Check for exact
>>
>>
>> What would be the value of ratio_conl such that this test fails?  I
>> think I'm missing something...
>>
>> -- Chuck
>>
>>
>>     Vladimir
>>
>>     On 9/26/16 5:01 PM, Chuck Rasbold wrote:
>>
>>         Just to confirm, are you suggesting that the ratio be first
>>         computed as a 64 bit quantity, effectively along the lines of...
>>
>>            long ratio_conl = ((long) stride_con2) / stride_con;
>>
>>            if ((ratio_conl * stride_con) == stride_con2 &&
>>                ratio_conl < 0x8000000 ) { // Check for exact
>>              int ratio_con = (int) ratio_conl;
>>
>>
>>         On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov
>>         <vladimir.kozlov at oracle.com> wrote:
>>
>>             Hi Chuck
>>
>>             Can you do 'long' arithmetic in existing condition to catch
>>             integer overflow instead?
>>
>>             if ((ratio_con * stride_con) == stride_con2) { // Check for exact
>>
>>             thanks,
>>             Vladimir
>>
>>
>>
>>             On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>>
>>                 A small fix for an edge case crash in C2...
>>
>>                 Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
>>                 Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>>
>>                 Requesting a sponsor and reviews.  Thanks.
>>
>>                 -- Chuck
>>
>>
>>
>>

From markus.gronlund at oracle.com  Tue Sep 27 20:58:46 2016
From: markus.gronlund at oracle.com (Markus Gronlund)
Date: Tue, 27 Sep 2016 13:58:46 -0700 (PDT)
Subject: RFR(XS): 8166806: Add intrinsic support for writer used in event
	based tracing
Message-ID: <da213d10-a154-4d84-99e8-2cf030ec51fa@default>

Greetings,

 
Kindly asking for reviews for the following change:

 
Bug: http://bugs.openjdk.java.net/browse/JDK-8166806 

Webrev: http://cr.openjdk.java.net/~mgronlun/8166806/webrev/

 
Thanks in advance

Markus

From rwestrel at redhat.com  Wed Sep 28 09:30:03 2016
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 28 Sep 2016 11:30:03 +0200
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
Message-ID: <dk6eg448j8k.fsf@rwestrel.remote.csb>


http://cr.openjdk.java.net/~roland/8166836/webrev.00/

For a non-escaping allocation, the logic that eliminates a clone's
ArrayCopy node adds loads for each field of the eliminated allocation
at safepoints. For each load, the load's control is set so the load is
after the memory barrier that always precedes the ArrayCopy node but the
memory edge is set to the memory state before the memory barrier.
Anti-dependency edges are added to the load nodes, resulting in a graph
that can't be scheduled and a compilation that always fails.

I think the memory edges of the loads bypass the memory barrier so the
loads have a chance to be optimized out (if for instance the loads are
from a just allocated object). But that code doesn't seem to even work
in simple cases. Instead, I propose we eliminate the memory barrier
before the ArrayCopy node (and the one after).

It's quite unfortunate that this wasn't found by testing because
compilations where the graph is non schedulable simply fail. This could
have gone unnoticed much longer. In debug builds shouldn't we abort the
VM in C2Compiler::compile_method() if the compilation fails because of a
non schedulable graph?

Roland.

From vladimir.x.ivanov at oracle.com  Wed Sep 28 10:04:39 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 28 Sep 2016 13:04:39 +0300
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
In-Reply-To: <dk6eg448j8k.fsf@rwestrel.remote.csb>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
Message-ID: <182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>

> http://cr.openjdk.java.net/~roland/8166836/webrev.00/

Looks good.

> It's quite unfortunate that this wasn't found by testing because
> compilations where the graph is non schedulable simply fail. This could
> have gone unnoticed much longer. In debug builds shouldn't we abort the
> VM in C2Compiler::compile_method() if the compilation fails because of a
> non schedulable graph?

Sounds reasonable. I expect there are other cases when compilers bail
out unexpectedly. It would be good to have an assert checking it doesn't
happen.

Best regards,
Vladimir Ivanov

From rwestrel at redhat.com  Wed Sep 28 13:03:40 2016
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 28 Sep 2016 15:03:40 +0200
Subject: Odd inlining failure
In-Reply-To: <CAHjP37HtiaJpJyDF4BLbwKobvgs2q2=U3f7Lv+WpOPf3UTm=ng@mail.gmail.com>
References: <CAHjP37HtiaJpJyDF4BLbwKobvgs2q2=U3f7Lv+WpOPf3UTm=ng@mail.gmail.com>
Message-ID: <dk6bmz889cj.fsf@rwestrel.remote.csb>


Hi Vitaly,

> I'm trying to understand some "odd" inlining output from PrintInlining -
> hoping someone can explain/confirm.

You could run with -XX:+PrintMethodData (diagnostic). It prints all
profile data at the end of the execution of the VM. You can then look at
invocation counts at the call site of c in b and b in a.

> I have the following call graph:
> a()
> ------> b()
> --------------> c()
>
> So a() calls b() (and some other methods that aren't relevant here).  b()
> calls c() and d() internally.  a() gets hot, and is queued up for
> compilation (C2, tiered is disabled).
>
> b() is large (> MaxInlineSize) but less than FreqInlineSize - it gets
> inlined with "inline (hot)" in the log.  c() is similar -- it's large, but
> < FreqInlineSize.  However, the inlining output says "too big", and c()
> isn't inlined.  Now, c() is *always* called when b() is called - it's a
> helper method (ironically, contains code moved out of b() to make b()
> smaller).  b() is also the only caller of c().
>
> So, if b() is "hot", why is c() not? Is it because compilation, and
> therefore inlining, started top-down here? CompileThreshold is the default
> here - 10000.  Is it the case that b() reaches 10k, but c() is at 9999
> still and is therefore not inlined?

Maybe b() gets compiled early, which would mean we stop collecting
profile data at the call site for c() in b()? Is there a loop in b()
that would cause it to be compiled before it's invoked 10k times?

Roland.

From vitalyd at gmail.com  Wed Sep 28 14:49:34 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 28 Sep 2016 10:49:34 -0400
Subject: Odd inlining failure
In-Reply-To: <dk660pg84y9.fsf@rwestrel.remote.csb>
References: <CAHjP37HtiaJpJyDF4BLbwKobvgs2q2=U3f7Lv+WpOPf3UTm=ng@mail.gmail.com>
	<dk6bmz889cj.fsf@rwestrel.remote.csb>
	<CAHjP37HW4dJ43vPzseL+jEOcW8p2+32sToQD4F6BcD4nY9c5iw@mail.gmail.com>
	<dk660pg84y9.fsf@rwestrel.remote.csb>
Message-ID: <CAHjP37H9QmjTHv0mHwkxjW8DjmgqNiBaa89HHfzCh6KTL=YZDg@mail.gmail.com>

Apologies, I accidentally dropped the list from my reply to Roland (quoted
below).

On Wed, Sep 28, 2016 at 10:38 AM, Roland Westrelin <rwestrel at redhat.com>
wrote:

>
> > In this case when b() is called its invocation count is +1 to c() because
> > c() is only called by b().  Now, a() has a big switch statement, with one
> > arm calling into b().  a() is called in a loop of sorts.  So I think the
> > switch arm calling b() gets hot and inlining starts.  But since inlining
> > is going top-down here, I suspect it's failing to inline helper methods,
> > such as c(), that are just as hot as b() but the 10000'th invocation
> > hasn't been recorded yet? This seems kind of broken though, if true, so
> > I'm wondering if I'm missing something.
> >
> > When recursively inlining, starting at a hot method, do the recursive
> > callsites, like c(), need to also have exactly 10,000 (or more)
> > invocations? What if it's, say, 9995?
>
> CompileThreshold=10000 is when compilation is triggered. It doesn't come
> into play to decide whether inlining happens or not. Also if you have
> loops, compilation is triggered when invocation counter + backedge
> counter exceeds CompileThreshold.
>
> Note also, that profiling (invocation counters at calls etc.) doesn't
> start until a method has been invoked a minimum number of times:
> InterpreterProfilePercentage % of CompileThreshold. So profiling doesn't
> start until invocation counter + backedge counter is greater than 3300
> by default with tiered off. If your method is inlined before it's been
> invoked 3300 times, all the call sites in the method are cold.
>
Ah, maybe that's the reason -- there's a loop in the outer method (a()).
I'll need to look at the compilation log or the PrintMethodData output that
you suggested.
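
For reference, the counter arithmetic Roland describes above can be sketched
as follows. This is a simplified model and not HotSpot code: the function
names are made up, "exceeds" is modelled as a strict >, and the percentage of
33 used in the note below is only inferred from the 3300 figure quoted for
CompileThreshold=10000.

```cpp
#include <cassert>

// Count at which interpreter profiling starts:
// InterpreterProfilePercentage % of CompileThreshold.
int profile_start_count(int compile_threshold, int profile_percentage) {
  assert(profile_percentage >= 0 && profile_percentage <= 100);
  return compile_threshold * profile_percentage / 100;
}

// Profiling is active once invocation + backedge counters exceed that count.
bool profiling_started(int invocations, int backedges,
                       int compile_threshold, int profile_percentage) {
  return invocations + backedges >
         profile_start_count(compile_threshold, profile_percentage);
}

// Compilation is triggered once invocation + backedge counters exceed
// CompileThreshold, so a loop can trigger it after very few invocations.
bool compilation_triggered(int invocations, int backedges,
                           int compile_threshold) {
  return invocations + backedges > compile_threshold;
}
```

In this model profile_start_count(10000, 33) is 3300, and a method with 10
invocations and 10000 loop backedges already triggers compilation.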

>
> And if methods are invoked by multiple threads, updates to the counters
> can be lost.
>
Single thread here.

>
> The code that triggers inlining is:
>
>   int call_site_count  = method()->scale_count(profile.count());
>   int invoke_count     = method()->interpreter_invocation_count();
>
>   assert(invoke_count != 0, "require invocation count greater than zero");
>   int freq = call_site_count / invoke_count;
>
>   // bump the max size if the call is frequent
>   if ((freq >= InlineFrequencyRatio) ||
>       (call_site_count >= InlineFrequencyCount) ||
>       is_unboxing_method(callee_method, C) ||
>       is_init_with_ea(callee_method, caller_method, C)) {
>
>     max_inline_size = C->freq_inline_size();
>     if (size <= max_inline_size && TraceFrequencyInlining) {
>       CompileTask::print_inline_indent(inline_level());
>       tty->print_cr("Inlined frequent method (freq=%d count=%d):", freq,
>                     call_site_count);
>       CompileTask::print_inline_indent(inline_level());
>       callee_method->print();
>       tty->cr();
>     }
>   } else {
>     // Not hot.  Check for medium-sized pre-existing nmethod at cold sites.
>     if (callee_method->has_compiled_code() &&
>         callee_method->instructions_size() > inline_small_code_size) {
>       set_msg("already compiled into a medium method");
>       return false;
>     }
>   }
>   if (size > max_inline_size) {
>     if (max_inline_size > default_max_inline_size) {
>       set_msg("hot method too big");
>     } else {
>       set_msg("too big");
>     }
>     return false;
>   }
>
> So a call site is hot if the call site count exceeds
> InlineFrequencyCount (100) or the frequency (ratio of the number of times
> the call was taken to the number of times the caller was entered) exceeds
> InlineFrequencyRatio (20). InlineFrequencyCount is way below 10000.
>
> Do you have this as a simple test case that you can share?
>
I don't yet - I'll see if I can reproduce something.  As noted,
microbenchmarks/reduced test cases usually do the right thing but when
same/similar code shapes/call graphs are incorporated into a large app,
they don't.
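
To check my reading of the code Roland quoted, here is a stripped-down model
of the hot/cold decision. It is a sketch only: it leaves out the unboxing and
is_init_with_ea cases and the nmethod-size check, CallSiteProfile and
size_verdict are made-up names, the size limits are plain parameters rather
than the real flag values, and "fits" is just a placeholder result.

```cpp
#include <cassert>
#include <string>

struct CallSiteProfile {
  int call_site_count;  // times the call was taken, from the profile
  int invoke_count;     // times the caller was entered
};

// Values quoted in the thread.
const int kInlineFrequencyRatio = 20;
const int kInlineFrequencyCount = 100;

bool is_hot_call_site(const CallSiteProfile& p) {
  assert(p.invoke_count != 0);
  int freq = p.call_site_count / p.invoke_count;  // integer division, as quoted
  return freq >= kInlineFrequencyRatio ||
         p.call_site_count >= kInlineFrequencyCount;
}

// Mirrors the size check: a hot site gets the larger limit, and the message
// shows which limit was applied.
std::string size_verdict(int callee_size, const CallSiteProfile& p,
                         int max_inline_size, int freq_inline_size) {
  int limit = is_hot_call_site(p) ? freq_inline_size : max_inline_size;
  if (callee_size > limit) {
    return limit > max_inline_size ? "hot method too big" : "too big";
  }
  return "fits";
}
```

If this model is right, a plain "too big" (rather than "hot method too big")
means the call site to c() was treated as cold, i.e. its recorded count was
below 100 when the caller was compiled.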

>
> > I need to go look at the inlining heuristic code again, but maybe you
> > know offhand.
> >
> > As a general observation, I'm seeing lots of inlining failures, for a
> > variety of reasons, in a complex app where I think inlining would help.
> > The heuristics aren't doing the "right" thing.  I know there are a few
> > longstanding JBS entries around inlining, but I'm wondering if they will
> > ever be addressed or whether Graal simply takes over for C2.  I wonder if
> > Oracle or RedHat or anyone else looks at inlining output on large apps
> > as a way to assess its effect? Microbenchmarks are usually fine because
> > the profile is different, methods typically don't fail to inline because
> > of InlineSmallCode, etc.
> >
> > I know I'm preaching to the choir and I apologize for the semi-rant, but
> > inlining is paramount to Java performance, more so than other languages
> > (e.g. C/C++) because of all the safety checks.  Given @ForceInline isn't
> > really available for end users, it's a huge pain and sometimes
> > practically impossible to convince C2 to inline something.
> >
> > I understand Graal has better inlining properties (I believe it pseudo
> > inlines to see if it's profitable, regardless of bytecode size).  Is that
> > the Hotspot answer to improved inlining?
> >
> > What the heck is everyone else doing for large apps with lots of hot
> > callsites? :) I can move some code around manually to outline some
> > (uncommon) code to slim down methods, but that's a hack IMO.
>
> You didn't send that email to the list. Was it intended?

Argh - no, that was unintentional.  I'm adding the list back in here.

> I'm curious
> what others would say. All I can say is that inlining heuristics are a
> known weakness of c2. Improving them is not a simple project. Also
> having graal on the horizon probably doesn't help: it could be a lot of
> work that will be of little value when graal is here, whenever that
> happens.
>
> Roland.
>
Thanks again

From rednaxelafx at gmail.com  Wed Sep 28 15:37:19 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Wed, 28 Sep 2016 08:37:19 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
Message-ID: <CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>

Hi guys,

Here's the HotSpot-side patch, based on OpenJDK9 HotSpot:

Webrev: http://cr.openjdk.java.net/~kmo/8166840/webrev.00/

Please give me a preliminary idea of how you guys feel about the patch, and
then I'll start an actual review thread if people agree on the direction of
this patch.

Note: This is the way javac constructs that "XXX$1" name for the
accessConstructorTag:
JDK7u:
http://hg.openjdk.java.net/jdk7u/jdk7u/langtools/file/93a2788178e6/src/share/classes/com/sun/tools/javac/comp/Lower.java#l1154
JDK9:
http://hg.openjdk.java.net/jdk9/jdk9/langtools/file/9f61004270d8/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Lower.java#l1241

So name matching on the "$1" suffix is sufficient here to work around this
particular pattern from javac.
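
As a rough illustration of what such a suffix match looks like (a sketch only,
not the code in the webrev; the function name is hypothetical and the class
name is taken as a plain string rather than a HotSpot Symbol):

```cpp
#include <cassert>
#include <string>

// Hypothetical check for illustration: does the class name end in "$1",
// the suffix javac uses for the access constructor tag class?
bool looks_like_access_constructor_tag(const std::string& class_name) {
  const std::string suffix = "$1";
  // Require at least one character before the suffix, then compare the tail.
  return class_name.size() > suffix.size() &&
         class_name.compare(class_name.size() - suffix.size(),
                            suffix.size(), suffix) == 0;
}
```

A pure name match like this also accepts an ordinary first anonymous class
such as Foo$1, so it identifies the javac pattern by name only.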

P.S. I haven't built OpenJDK9 in quite a while now, and apparently the
makefiles have changed and the scripts that I used to build JDK7u / JDK8u
don't work on JDK9. What's the current recommended way to build just
HotSpot with fastdebug / product levels?

Thanks,
Kris (OpenJDK username: kmo)

On Wed, Sep 14, 2016 at 3:12 AM, Vladimir Ivanov
<vladimir.x.ivanov at oracle.com> wrote:

> Kris,
>
>> And I'm happy to upstream that patch, if the team is interested.
>
> Sure, we are definitely interested in fixing that. Feel free to file a bug
> and send the fix out for review.
>
>> Now, when I first discovered the problem, my first intuition was that
>> it's better to "fix" it in javac. But before nest mates in the Class
>> file, there isn't much that javac could do. Changing the Java libraries
>> to not use private constructors in inner classes is also doable, but
>> needs changing a lot of files.
>
> I agree that javac is not the best place to fix the immediate problem: it
> requires recompilation and there are already lots of problematic bytecode
> shapes out in the wild. The JVM should optimize for that case instead.
>
>> So I ended up fixing it in the VM, even though I agree fully with what
>> Rémi brought up.
>
> I'm curious how you fixed it. I haven't found a description in the
> thread.
>
> It's possible to force class loading, but I'm worried about undesirable
> effects of class initialization. Is it enough for C2 to have the class
> loaded but not initialized to make it work?
>
> Another approach would be to issue a null check and deoptimize (for bridge
> methods, the check collapses after inlining since the argument is always
> null) or add a nmethod dependency and throw away the code when the
> parameter class is loaded.
>
> Best regards,
> Vladimir Ivanov
>
>> The access constructor tag thingy in javac is really a weird hack. If
>> you guys ever look at the contents of ArrayList$1, it's really empty
>> -- the class doesn't even declare some of the usual structures in a
>> normal Class file... Hopefully we can get rid of it in javac soon.
>>
>> On Tuesday, September 13, 2016, Vitaly Davidovich
>> <vitalyd at gmail.com> wrote:
>>
>>
>>
>>     On Tuesday, September 13, 2016, Remi Forax <forax at univ-mlv.fr>
>>     wrote:
>>
>>         I've always found the empty inner classes generated by
>>         javac to be a kind of hack.
>>
>>         These classes should be removed in Java 10, thanks to the
>>         nestmate attributes.
>>
>>         http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html
>>
>>         The other solution, is to have an empty class in the jdk which
>>         is not visible from javac (the class itself can be marked as
>>         synthetic),
>>         so javac can use it without creating method clash.
>>
>>         and to solve the problem now, the easy solution is to add a
>>         package private constructor in ArrayList.Itr,
>>
>>     I'm hoping Oracle can take Kris' (Azul) patch (or do something
>>     similar).  It might catch more cases than just modifying Itr.
>>
>>
>>         private class Itr implements Iterator<E> {
>>             int cursor;       // index of next element to return
>>             int lastRet = -1; // index of last element returned; -1 if no such
>>             int expectedModCount = modCount;
>>
>>             Itr() {
>>                 // avoid generating a synthetic accessor constructor
>>             }
>>         }
>>
>>
>>         regards,
>>         Rémi
>>
>>         ------------------------------------------------------------------------
>>
>>             *From: *"Vitaly Davidovich" <vitalyd at gmail.com>
>>             *To: *"Krystal Mok" <rednaxelafx at gmail.com>
>>             *Cc: *"hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
>>             *Sent: *Monday, 12 September 2016 22:15:41
>>             *Subject: *Re: Odd interaction between ArrayList$Itr and
>>             Escape Analysis
>>
>>
>>
>>             On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok
>>             <rednaxelafx at gmail.com> wrote:
>>
>>                 On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich
>>                 <vitalyd at gmail.com> wrote:
>>
>>                     It seems odd to me as well why inlining won't force
>>                     load the missing class(es).  If we're inlining, it
>>                     means the method itself or the call chain it's part
>>                     of is hot - failing to inline can have negative
>>                     side-effects, like this example.  I suppose there
>>                     must be a good reason why it doesn't do this though?
>>
>>
>>                 That's because we can't. The JIT compilers are running
>>                 on their own threads, and they're not real "Java
>>                 threads". So they are not allowed to run arbitrary Java
>>                 code. But Java class loading may involve running
>>                 arbitrary Java code, e.g. the ClassLoader.loadClass()
>>                 upcall.
>>                 Force class loading can be done on the triggering side
>>                 (for the top-level method), because compilation tasks
>>                 are triggered from real Java threads, and they're
>>                 allowed to run arbitrary Java code.
>>
>>             I see, makes sense.  Perhaps there can be an option to turn
>>             on loading of required types in the entire compilation unit,
>>             after all inlining is done (and therefore make the unloaded
>>             types not be barriers for inlining).  I'd personally prefer
>>             that over having odd performance differences.
>>
>>
>>                 - Kris
>>
>>             Thanks Kris.
>>
>>
>>
>>     --
>>     Sent from my phone
>>
>>

From vladimir.x.ivanov at oracle.com  Wed Sep 28 16:17:48 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 28 Sep 2016 19:17:48 +0300
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
	<CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
Message-ID: <a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>

Kris, thanks for sharing the patch!

IMO the problem we observe is not specific to bridge methods.

It demonstrates a generic shortcoming in the C2 inlining heuristic: even
though the argument is never used (otherwise, the class would have been
already loaded, right?), we don't inline the whole method.

So, I'd prefer to see a solution which covers the general case.

Can we do that? It seems so: it could be achieved by a null guard on the 
argument or a nmethod dependency on the unloaded class.

Best regards,
Vladimir Ivanov

On 9/28/16 6:37 PM, Krystal Mok wrote:
> Hi guys,
>
> Here's the HotSpot-side patch, based on OpenJDK9 HotSpot:
>
> Webrev: http://cr.openjdk.java.net/~kmo/8166840/webrev.00/
>
> Please give me a preliminary idea of how you guys feel about the patch,
> and then I'll start an actual review thread if people agree on the
> direction of this patch.
>
> Note: This is the way javac constructs that "XXX$1" name for the
> accessConstructorTag:
> JDK7u: http://hg.openjdk.java.net/jdk7u/jdk7u/langtools/file/93a2788178e6/src/share/classes/com/sun/tools/javac/comp/Lower.java#l1154
> JDK9:
> http://hg.openjdk.java.net/jdk9/jdk9/langtools/file/9f61004270d8/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Lower.java#l1241
>
> So name matching on the "$1" suffix is sufficient here to work around this
> particular pattern from javac.
>
> P.S. I haven't built OpenJDK9 in quite a while now, and apparently the
> makefiles have changed and the scripts that I used to build JDK7u /
> JDK8u don't work on JDK9. What's the current recommended way to build
> just HotSpot with fastdebug / product levels?
>
> Thanks,
> Kris (OpenJDK username: kmo)
>
> On Wed, Sep 14, 2016 at 3:12 AM, Vladimir Ivanov
> <vladimir.x.ivanov at oracle.com> wrote:
>
>     Kris,
>
>         And I'm happy to upstream that patch, if the team is interested.
>
>
>     Sure, we are definitely interested in fixing that. Feel free to file
>     a bug and send the fix out for review.
>
>         Now, when I first discovered the problem, my first intuition was
>         that it's better to "fix" it in javac. But before nest mates in
>         the Class file, there isn't much that javac could do. Changing the
>         Java libraries to not use private constructors in inner classes is
>         also doable, but needs changing a lot of files.
>
>
>     I agree that javac is not the best place to fix the immediate
>     problem: it requires recompilation and there are already lots of
>     problematic bytecode shapes out in the wild. The JVM should optimize
>     for that case instead.
>
>         So I ended up fixing it in the VM, even though I agree fully
>         with what Rémi brought up.
>
>
>     I'm curious how you fixed it. I haven't found a description in the
>     thread.
>
>     It's possible to force class loading, but I'm worried about
>     undesirable effects of class initialization. Is it enough for C2 to
>     have the class loaded but not initialized to make it work?
>
>     Another approach would be to issue a null check and deoptimize (for
>     bridge methods, the check collapses after inlining since the
>     argument is always null) or add a nmethod dependency and throw away
>     the code when the parameter class is loaded.
>
>     Best regards,
>     Vladimir Ivanov
>
>         The access constructor tag thingy in javac is really a weird
>         hack. If you guys ever look at the contents of ArrayList$1, it's
>         really empty -- the class doesn't even declare some of the usual
>         structures in a normal Class file... Hopefully we can get rid of
>         it in javac soon.
>
>
>         On Tuesday, September 13, 2016, Vitaly Davidovich
>         <vitalyd at gmail.com> wrote:
>
>
>
>             On Tuesday, September 13, 2016, Remi Forax
>             <forax at univ-mlv.fr> wrote:
>
>                 I've always found the empty inner classes generated by
>                 javac to be a kind of hack.
>
>                 These classes should be removed in Java 10, thanks to the
>                 nestmate attributes.
>
>
>                 http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html
>
>                 The other solution, is to have an empty class in the jdk
>                 which is not visible from javac (the class itself can be
>                 marked as synthetic),
>                 so javac can use it without creating method clash.
>
>                 and to solve the problem now, the easy solution is to add a
>                 package private constructor in ArrayList.Itr,
>
>             I'm hoping Oracle can take Kris' (Azul) patch (or do something
>             similar).  It might catch more cases than just modifying Itr.
>
>
>                 private class Itr implements Iterator<E> {
>                     int cursor;       // index of next element to return
>                     int lastRet = -1; // index of last element returned; -1 if no such
>                     int expectedModCount = modCount;
>
>                     Itr() {
>                         // avoid generating a synthetic accessor constructor
>                     }
>                 }
>
>
>                 regards,
>                 Rémi
>
>
>         ------------------------------------------------------------------------
>
>                     *From: *"Vitaly Davidovich" <vitalyd at gmail.com>
>                     *To: *"Krystal Mok" <rednaxelafx at gmail.com>
>                     *Cc: *"hotspot compiler"
>                     <hotspot-compiler-dev at openjdk.java.net>
>                     *Sent: *Monday, 12 September 2016 22:15:41
>                     *Subject: *Re: Odd interaction between ArrayList$Itr and
>                     Escape Analysis
>
>
>
>                     On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok
>                     <rednaxelafx at gmail.com> wrote:
>
>                         On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich
>                         <vitalyd at gmail.com> wrote:
>
>                             It seems odd to me as well why inlining won't
>                             force load the missing class(es).  If we're
>                             inlining, it means the method itself or the call
>                             chain it's part of is hot - failing to inline can
>                             have negative side-effects, like this example.  I
>                             suppose there must be a good reason why it
>                             doesn't do this though?
>
>
>                         That's because we can't. The JIT compilers are
>                         running on their own threads, and they're not real
>                         "Java threads". So they are not allowed to run
>                         arbitrary Java code. But Java class loading may
>                         involve running arbitrary Java code, e.g. the
>                         ClassLoader.loadClass() upcall.
>                         Force class loading can be done on the triggering
>                         side (for the top-level method), because compilation
>                         tasks are triggered from real Java threads, and
>                         they're allowed to run arbitrary Java code.
>
>                     I see, makes sense.  Perhaps there can be an option to
>                     turn on loading of required types in the entire
>                     compilation unit, after all inlining is done (and
>                     therefore make the unloaded types not be barriers for
>                     inlining).  I'd personally prefer that over having odd
>                     performance differences.
>
>
>                         - Kris
>
>                     Thanks Kris.
>
>
>
>             --
>             Sent from my phone
>
>

From rednaxelafx at gmail.com  Wed Sep 28 16:42:34 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Wed, 28 Sep 2016 09:42:34 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
	<CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
	<a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>
Message-ID: <CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>

Hi Vladimir,

Yes, the patch I posted was the short-term one that I used for getting rid
of this particular kind of problem before a release, and it's already in
production for us. So it was deliberately focused on a very narrow scenario
so that I don't have to worry about testing too much.

I also have another patch for the general case of "unused unloaded
arguments". I haven't gotten around to polishing and testing that patch yet,
but since we're seeing good motivation on the OpenJDK side as well, I may as
well go back and get that patch ready soon.

A null guard is a good way to go. It's basically the same kind of logic
that C2 OSR entry already uses. In this case, at a call site, a null guard
on the caller-side against an argument whose type is unloaded is one way to
do it.
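
To make the bytecode shape concrete, here is a hand-desugared Java sketch of
what this thread describes javac emitting for a private inner-class
constructor: an extra constructor whose only additional parameter has an
otherwise empty tag type, with the caller always passing a constant null.
TinyList, AccessTag and the explicitly written tagged constructor are
hypothetical stand-ins (for ArrayList, ArrayList$1 and the synthetic access
constructor); this is an illustration only, not the javac output and not the
HotSpot patch.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a collection with a private inner iterator. */
final class TinyList {
    private final Object[] elems;

    TinyList(Object... elems) {
        this.elems = elems.clone();
    }

    /** Stand-in for the empty tag class (ArrayList$1 in the thread). Never instantiated. */
    static final class AccessTag {
        private AccessTag() {}
    }

    final class Itr implements Iterator<Object> {
        int cursor; // index of next element to return

        private Itr() {}

        /**
         * Hand-written stand-in for the synthetic access constructor.
         * The tag parameter exists only to make the signature unique;
         * its value is never read.
         */
        Itr(AccessTag unusedTag) {
            this();
        }

        @Override
        public boolean hasNext() {
            return cursor < elems.length;
        }

        @Override
        public Object next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return elems[cursor++];
        }
    }

    Iterator<Object> iterator() {
        // The call site always passes a constant null for the tag argument.
        return new Itr((AccessTag) null);
    }
}
```

In the real javac-generated case the tag class is never instantiated, so it
can still be unloaded when the call site is compiled. Since the argument is a
constant null and the callee never reads it, a caller-side null guard on that
argument is trivially satisfied.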

(There are of course other alternatives. E.g., if we focus on the
callee side, in a compiler with a mixed top-down / bottom-up inlining
heuristic, the (devirtualized if needed) callee can be inspected first to
see whether an argument of unloaded type is ever used. If it is never used,
there is no need to insert the null guard on the caller side; simply going
ahead and inlining is good and safe. C2 doesn't have this luxury yet, so
tackling the problem with a caller-side solution is easier to do.)

IMO an nmethod dependency on an "unloaded class" isn't that feasible, since
you might not even have a concrete entity to "depend on", and registering
symbolic dependencies for "unloaded classes" in general, even though I
believe it is doable, might be rather tedious.

Thanks,
Kris

On Wed, Sep 28, 2016 at 9:17 AM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

> Kris, thanks for sharing the patch!
>
> IMO the problem we observe is not specific to bridge methods.
>
> It demonstrates a generic short-coming in C2 inlining heuristic: even
> though the argument is never used (otherwise, the class would have been
> already loaded, right?), we don't inline the whole method.
>
> So, I'd prefer to see a solution which covers the general case.
>
> Can we do that? It seems so: it could be achieved by a null guard on the
> argument or a nmethod dependency on the unloaded class.
>
> Best regards,
> Vladimir Ivanov
>
> On 9/28/16 6:37 PM, Krystal Mok wrote:
>
>> Hi guys,
>>
>> Here's the HotSpot-side patch, based on OpenJDK9 HotSpot:
>>
>> Webrev: http://cr.openjdk.java.net/~kmo/8166840/webrev.00/
>>
>> Please give me a preliminary idea of how you guys feel about the patch,
>> and then I'll start an actual review thread if people agree on the
>> direction of this patch.
>>
>> Note: This is the way javac constructs that "XXX$1" name for the
>> accessConstructorTag:
>> JDK7u: http://hg.openjdk.java.net/jdk7u/jdk7u/langtools/file/93a2788178e6/src/share/classes/com/sun/tools/javac/comp/Lower.java#l1154
>> JDK9: http://hg.openjdk.java.net/jdk9/jdk9/langtools/file/9f61004270d8/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Lower.java#l1241
>>
>> So name matching on the "$1" suffix is sufficient here to work around this
>> particular pattern from javac.
>>
>> P.S. I haven't built OpenJDK9 in quite a while now, and apparently the
>> makefiles have changed and the scripts that I used to build JDK7u /
>> JDK8u don't work on JDK9. What's the current recommended way to build
>> just HotSpot with fastdebug / product levels?
>>
>> Thanks,
>> Kris (OpenJDK username: kmo)
>>
>> On Wed, Sep 14, 2016 at 3:12 AM, Vladimir Ivanov
>> <vladimir.x.ivanov at oracle.com <mailto:vladimir.x.ivanov at oracle.com>>
>> wrote:
>>
>>     Kris,
>>
>>         And I'm happy to upstream that patch, if the team is interested.
>>
>>
>>     Sure, we are definitely interested in fixing that. Feel free to file
>>     a bug and send the fix out for review.
>>
>>         Now, when I first discovered the problem, my first intuition was
>>         that it's better to "fix" it in javac. But before nest mates in
>>         the Class file, there isn't much that javac could do. Changing
>>         the Java libraries to not use private constructors in inner
>>         classes is also doable, but needs changing a lot of files.
>>
>>
>>     I agree that javac is not the best place to fix the immediate
>>     problem: it requires recompilation and there are already lots of
>>     problematic bytecode shapes out in the wild. The JVM should optimize
>>     for that case instead.
>>
>>         So I ended up fixing it in the VM, even though I agree fully
>>         with what Rémi brought up.
>>
>>
>>     I'm curious how you fixed it. I haven't found a description in the
>>     thread.
>>
>>     It's possible to force class loading, but I'm worried about
>>     undesirable effects of class initialization. Is it enough for C2 to
>>     have the class loaded but not initialized to make it work?
>>
>>     Another approach would be to issue a null check and deoptimize (for
>>     bridge methods, the check collapses after inlining since the
>>     argument is always null) or add a nmethod dependency and throw away
>>     the code when the parameter class is loaded.
>>
>>     Best regards,
>>     Vladimir Ivanov
>>
>>         The access constructor tag thingy in javac is really a weird
>>         hack. If you guys ever look at the contents of ArrayList$1, it's
>>         really empty -- the class doesn't even declare some of the usual
>>         structures in a normal Class file... Hopefully we can get rid of
>>         it in javac soon.
>>
>>
>>         On Tuesday, September 13, 2016, Vitaly Davidovich
>>         <vitalyd at gmail.com> wrote:
>>
>>             On Tuesday, September 13, 2016, Remi Forax
>>             <forax at univ-mlv.fr> wrote:
>>
>>                 I've always found the empty inner classes generated by
>>                 javac to be a kind of hack.
>>
>>                 These classes should be removed in Java 10, thanks to the
>>                 nestmate attributes.
>>
>>
>>         http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html
>>
>>                 The other solution is to have an empty class in the JDK
>>                 which is not visible from javac (the class itself can be
>>                 marked as synthetic), so javac can use it without creating
>>                 a method clash.
>>
>>                 And to solve the problem now, the easy solution is to add
>>                 a package-private constructor in ArrayList.Itr,
>>
>>             I'm hoping Oracle can take Kris' (Azul) patch (or do something
>>             similar).  It might catch more cases than just modifying Itr.
>>
>>
>>                 private class Itr implements Iterator<E> {
>>                 int cursor; // index of next element to return
>>                 int lastRet = -1; // index of last element returned; -1 if no such
>>                 int expectedModCount = modCount;
>>
>>                 Itr() {
>>                 // avoid to generate a synthetic accessor constructor
>>                 }
>>                 }
>>
>>
>>                 regards,
>>                 Rémi
>>
>>
>>         ------------------------------------------------------------------------
>>
>>                     *From: *"Vitaly Davidovich" <vitalyd at gmail.com>
>>                     *To: *"Krystal Mok" <rednaxelafx at gmail.com>
>>                     *Cc: *"hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
>>                     *Sent: *Monday, 12 September 2016 22:15:41
>>                     *Subject: *Re: Odd interaction between ArrayList$Itr and Escape Analysis
>>
>>
>>
>>                     On Mon, Sep 12, 2016 at 3:56 PM, Krystal Mok
>>                     <rednaxelafx at gmail.com> wrote:
>>
>>                         On Mon, Sep 12, 2016 at 12:38 PM, Vitaly Davidovich
>>                         <vitalyd at gmail.com> wrote:
>>
>>                             It seems odd to me as well why inlining
>>         won't force
>>                             load the missing class(es).  If we're
>>         inlining, it
>>                             means the method itself or the call chain
>>         it's part
>>                             of is hot - failing to inline can have
>> negative
>>                             side-effects, like this example.  I suppose
>>         there
>>                             must be a good reason why it doesn't do this
>>         though?
>>
>>
>>                         That's because we can't. The JIT compilers are
>>         running
>>                         on their own threads, and they're not real "Java
>>                         threads". So they are not allowed to run
>>         arbitrary Java
>>                         code. But Java class loading may involve running
>>                         arbitrary Java code, e.g. the
>>         ClassLoader.loadClass()
>>                         upcall.
>>                         Force class loading can be done on the
>>         triggering side
>>                         (for the top-level method), because compilation
>>         tasks
>>                         are triggered from real Java threads, and they're
>>                         allowed to run arbitrary Java code.
>>
>>                     I see, makes sense.  Perhaps there can be an option
>>         to turn
>>                     on loading of required types in the entire
>>         compilation unit,
>>                     after all inlining is done (and therefore make the
>>         unloaded
>>                     types not be barriers for inlining).  I'd personally
>>         prefer
>>                     that over having odd performance differences.
>>
>>
>>                         - Kris
>>
>>                     Thanks Kris.
>>
>>
>>
>>             --
>>             Sent from my phone
>>
>>
>>

From vladimir.kozlov at oracle.com  Wed Sep 28 16:47:19 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Sep 2016 09:47:19 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <CALFb4Kt1B6TPiCsiGd-R3OTL+6DTZVvEFwgp4cjvq4Q-fHLjJg@mail.gmail.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
	<CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
	<57E9BEC5.2060308@oracle.com>
	<CALFb4KvUEtckd0kp_rKNGzJF7TVK7vnndws2WdEx6d=Pj6WkLw@mail.gmail.com>
	<274bdc7a-e7e5-882c-5715-32063f1e0f2c@oracle.com>
	<CALFb4Kt1B6TPiCsiGd-R3OTL+6DTZVvEFwgp4cjvq4Q-fHLjJg@mail.gmail.com>
Message-ID: <eb679309-cd0f-d95f-1fc6-935bf33a31bb@oracle.com>

On 9/27/16 1:57 PM, Chuck Rasbold wrote:
> Sorry for not being transparent enough.  Here's an external reference
> that describes the problem
> that is being encountered by the division:
>
> https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/Signed-Integer-Division.html
>
> That's why the original fix targeted a very specific case.  One can't
> represent ratio_con as a 32 bit value in that case.
> Worse, trying to compute it by division causes a SIGFPE.
>
> Do you think the revised code below is as straightforward as the original?

Okay, it looks like this is one very special case and not a range of
cases. Let's use your original fix then.
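
For readers following the arithmetic, here is a minimal sketch of the
representability check discussed in this thread, written in Java rather than
HotSpot's C++ (jint and jlong correspond to int and long). The class and
method names are hypothetical and this is not the committed fix. The idea is
to widen to 64 bits before dividing, then accept the ratio only if it fits in
32 bits and divides exactly. Note that Java itself defines
Integer.MIN_VALUE / -1 as wrapping to Integer.MIN_VALUE, so the SIGFPE is
specific to the native 32-bit division in the compiler's own code.

```java
import java.util.OptionalInt;

/** Hypothetical helper illustrating the stride-ratio check; not HotSpot code. */
final class StrideRatio {
    private StrideRatio() {}

    /**
     * Returns strideCon2 / strideCon when the division is exact and the
     * quotient fits in an int; otherwise returns an empty OptionalInt.
     */
    static OptionalInt exactRatio(int strideCon, int strideCon2) {
        if (strideCon == 0) {
            return OptionalInt.empty(); // no ratio for a zero stride
        }
        // Widen first: min_int / -1 is 2^31, which only a 64-bit value can hold.
        long ratio = (long) strideCon2 / strideCon;
        if (ratio != (int) ratio) {
            return OptionalInt.empty(); // quotient does not fit in 32 bits
        }
        if (ratio * strideCon != strideCon2) {
            return OptionalInt.empty(); // division was not exact
        }
        return OptionalInt.of((int) ratio);
    }
}
```

The only input pair rejected by the range check is stride_con2 == min_int
with stride_con == -1, which matches the observation that this is a single
case rather than a range.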

I will sponsor it.

Thanks,
Vladimir

>
> -- Chuck
>
> On Tue, Sep 27, 2016 at 9:48 AM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>
>     So why is it a SIGFPE when both values are 'int'?
>
>     I thought it was incorrect results that caused the SIGFPE; that is why
>     I suggested checking for integer overflow.
>
>     Let's then go with your second suggested change here. But let's check
>     that the ratio is small first and do the cast to (jint); otherwise the
>     long check is useless:
>
>       // The ratio of the two strides cannot be represented as an int
>       // if stride_con2 is min_int and stride_con is -1.
>       jlong ratio_conl = ((jlong)stride_con2 / stride_con);
>
>       if ((ratio_conl < 0x80000000L) &&
>           (jint)(ratio_conl * stride_con) == stride_con2) { // Check for exact
>          jint ratio_con = (jint)ratio_conl;
>
>     Thanks,
>     Vladimir
>
>     On 9/27/16 7:56 AM, Chuck Rasbold wrote:
>
>
>
>         On Mon, Sep 26, 2016 at 5:35 PM, Vladimir Kozlov
>         <vladimir.kozlov at oracle.com> wrote:
>
>             Slightly different (cast after /) and jlong type:
>
>               jlong ratio_conl = (jlong) (stride_con2 / stride_con);
>
>
>         The division above won't work (at least, it raises a SIGFPE on
>         my Linux x86 platform) when stride_con2 == min_jint and
>         stride_con == -1.
>
>
>               if ((ratio_conl * stride_con) == (jlong)stride_con2) { // Check for exact
>
>
>         What would be the value of ratio_conl such that this test fails?  I
>         think I'm missing something...
>
>         -- Chuck
>
>
>             Vladimir
>
>             On 9/26/16 5:01 PM, Chuck Rasbold wrote:
>
>                 Just to confirm, are you suggesting that the ratio be first
>                 computed as a 64 bit quantity, effectively along the
>         lines of...
>
>                    long ratio_conl = ((long) stride_con2) / stride_con;
>
>                    if ((ratio_conl * stride_con) == stride_con2 &&
>                        ratio_conl < 0x8000000 ) { // Check for exact
>                      int ratio_con = (int) ratio_conl;
>
>
>                 On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov
>                 <vladimir.kozlov at oracle.com> wrote:
>
>                     Hi Chuck
>
>                     Can you do 'long' arithmetic in the existing condition
>                     to catch integer overflow instead?
>
>                     if ((ratio_con * stride_con) == stride_con2) { // Check for exact
>
>                     thanks,
>                     Vladimir
>
>
>
>                     On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>
>                         A small fix for an edge case crash in C2...
>
>                         Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
>                         Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>
>                         Requesting a sponsor and reviews.  Thanks.
>
>                         -- Chuck
>
>
>
>

From rasbold at google.com  Wed Sep 28 17:02:00 2016
From: rasbold at google.com (Chuck Rasbold)
Date: Wed, 28 Sep 2016 10:02:00 -0700
Subject: RFR(S): 8166742 : SIGFPE in C2 Loop IV elimination
In-Reply-To: <eb679309-cd0f-d95f-1fc6-935bf33a31bb@oracle.com>
References: <CALFb4KuX4y0ftnww_+hnL2RurWnsyUNsm6_HhDKgCSYaNQhkcg@mail.gmail.com>
	<57E9A503.6090506@oracle.com>
	<CALFb4Ktyxt--39D-DQ=W+20n26t4JOyrvGEJ6q-+ZO77K2vPqg@mail.gmail.com>
	<57E9BEC5.2060308@oracle.com>
	<CALFb4KvUEtckd0kp_rKNGzJF7TVK7vnndws2WdEx6d=Pj6WkLw@mail.gmail.com>
	<274bdc7a-e7e5-882c-5715-32063f1e0f2c@oracle.com>
	<CALFb4Kt1B6TPiCsiGd-R3OTL+6DTZVvEFwgp4cjvq4Q-fHLjJg@mail.gmail.com>
	<eb679309-cd0f-d95f-1fc6-935bf33a31bb@oracle.com>
Message-ID: <CALFb4KuH6ULVSqr-is28bVK+AhAcaNu0_pzdmG=LgPZegV7TvA@mail.gmail.com>

Thanks, Vladimir!

On Wed, Sep 28, 2016 at 9:47 AM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:

> On 9/27/16 1:57 PM, Chuck Rasbold wrote:
>
>> Sorry for not being transparent enough.  Here's an external reference
>> that describes the problem
>> that is being encountered by the division:
>>
>> https://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/Signed-Integer-Division.html
>>
>> That's why the original fix targeted a very specific case.  One can't
>> represent ratio_con as a 32 bit value in that case.
>> Worse, trying to compute it by division causes a SIGFPE.
>>
>> Do you think the revised code below is as straightforward as the original?
>>
>
> Okay, looks like it is very special only one case and not a range of
> cases. Lets use your original fix then.
>
> I will sponsor it.
>
> Thanks,
> Vladimir
>
>
>> -- Chuck
>>
>> On Tue, Sep 27, 2016 at 9:48 AM, Vladimir Kozlov
>> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>>
>>     So why it is SIGFPE when both values are 'int'?
>>
>>     I thought it is incorrect results cause SIGFPE that is why I
>>     suggested to check for integer overflow.
>>
>>     Lets then go with your second suggested change here. But let check
>>     that ratio is small first and do cast to (jint) otherwise the long
>>     check is useless:
>>
>>       // The ratio of the two strides cannot be represented as an int
>>       // if stride_con2 is min_int and stride_con is -1.
>>       jlong ratio_conl = ((jlong)stride_con2 / stride_con);
>>
>>       if ((ratio_conl < 0x80000000L) &&
>>           (jint)(ratio_conl * stride_con) == stride_con2) { // Check for
>>     exact
>>          jint ratio_con = (jint)ratio_conl;
>>
>>     Thanks,
>>     Vladimir
>>
>>     On 9/27/16 7:56 AM, Chuck Rasbold wrote:
>>
>>
>>
>>         On Mon, Sep 26, 2016 at 5:35 PM, Vladimir Kozlov
>>         <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>
>>         <mailto:vladimir.kozlov at oracle.com
>>         <mailto:vladimir.kozlov at oracle.com>>> wrote:
>>
>>             Slightly different (cast after /) and jlong type:
>>
>>               jlong ratio_conl = (jlong) (stride_con2 / stride_con);
>>
>>
>>         The division above won't work (at least, it raises a SIGFPE on
>>         my Linux
>>         x86 platform) when stride_con2 == min_jint   and stride_con == -1.
>>
>>
>>               if ((ratio_conl * stride_con) == (jlong)stride_con2) { //
>>         Check
>>             for exact
>>
>>
>>         What would be the value of ratio_conl such that this test fails?
>> I
>>         think I'm missing something...
>>
>>         -- Chuck
>>
>>
>>             Vladimir
>>
>>             On 9/26/16 5:01 PM, Chuck Rasbold wrote:
>>
>>                 Just to confirm, are you suggesting that the ratio be
>> first
>>                 computed as a 64 bit quantity, effectively along the
>>         lines of...
>>
>>                    long ratio_conl = ((long) stride_con2) / stride_con;
>>
>>                    if ((ratio_conl * stride_con) == stride_con2 &&
>>                        ratio_conl < 0x8000000 ) { // Check for exact
>>                      int ratio_con = (int) ratio_conl;
>>
>>
>>                 On Mon, Sep 26, 2016 at 3:45 PM, Vladimir Kozlov
>>                 <vladimir.kozlov at oracle.com
>>         <mailto:vladimir.kozlov at oracle.com>
>>         <mailto:vladimir.kozlov at oracle.com
>>         <mailto:vladimir.kozlov at oracle.com>>
>>                 <mailto:vladimir.kozlov at oracle.com
>>         <mailto:vladimir.kozlov at oracle.com>
>>                 <mailto:vladimir.kozlov at oracle.com
>>         <mailto:vladimir.kozlov at oracle.com>>>> wrote:
>>
>>                     Hi Chuck
>>
>>                     Can you do 'long' arithmetic in existing condition
>>         to catch
>>                 integer overflow instead?
>>
>>                     if ((ratio_con * stride_con) == stride_con2) { //
>>         Check for
>>                 exact
>>
>>                     thanks,
>>                     Vladimir
>>
>>
>>
>>                     On 9/26/16 3:18 PM, Chuck Rasbold wrote:
>>
>>                         A small fix for an edge case crash in C2...
>>
>>                         Bug: https://bugs.openjdk.java.net/browse/JDK-8166742
>>                         Webrev: http://cr.openjdk.java.net/~rasbold/8166742/webrev.00/
>>
>>                         Requesting a sponsor and reviews.  Thanks.
>>
>>                         -- Chuck
>>
>>
>>
>>
>>

From vladimir.x.ivanov at oracle.com  Wed Sep 28 17:10:33 2016
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 28 Sep 2016 20:10:33 +0300
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
	<CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
	<a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>
	<CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>
Message-ID: <78ec340d-7c56-233a-5c0f-4f60b7bdda89@oracle.com>

Kris,

> A null guard is a good way to go. It's basically the same kind of logic
> that C2 OSR entry already uses. In this case, at a call site, a null
> guard on the caller-side against an argument whose type is unloaded is
> one way to do it.

When the argument value (null) is a compile-time constant, the guard
collapses right away. So, it sounds like a good solution.

> (There are of course other alternatives. e.g. If we focus on the
> callee-side, in a compiler with a mixed top-down / bottom-up inlining
> heuristics system, the (devirtualized if needed) callee can be inspected
> first to see if an argument of unloaded type is never used or not. If it
> is never used, don't even bother inserting the null guard on the
> caller-side, and just go ahead and inline would be good and safe. C2
> doesn't have this luxury yet so tackling the problem with a caller-side
> solution is easier to do.)
>
> IMO a nmethod dependency on an "unloaded class" isn't that feasible,
> since you might not even have a concrete entity to "depend on", and
> registering symbolic dependencies for "unloaded classes" in general,
> even though I believe is doable, might be rather tedious.

Agreed, it requires a new flavor of nmethod dependency.

Best regards,
Vladimir Ivanov

From rednaxelafx at gmail.com  Wed Sep 28 17:15:53 2016
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Wed, 28 Sep 2016 10:15:53 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <78ec340d-7c56-233a-5c0f-4f60b7bdda89@oracle.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
	<CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
	<a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>
	<CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>
	<78ec340d-7c56-233a-5c0f-4f60b7bdda89@oracle.com>
Message-ID: <CA+cQ+tQ1=kS+mrshZjD_uTmgtoqeiA1yp6X0AS3QPRpnL9LqWw@mail.gmail.com>

On Wed, Sep 28, 2016 at 10:10 AM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

> Kris,
>
>> A null guard is a good way to go. It's basically the same kind of logic
>> that C2 OSR entry already uses. In this case, at a call site, a null
>> guard on the caller-side against an argument whose type is unloaded is
>> one way to do it.
>>
>
> For the case when argument value (null) is a compile-time constant, the
> guard collapses right away. So, it sounds like a good solution.


That's exactly what I'm doing with my other patch. Let me prepare that and
send it out for review sometime this weekend.

Thanks,
Kris


>
>
>> (There are of course other alternatives. e.g. If we focus on the
>> callee-side, in a compiler with a mixed top-down / bottom-up inlining
>> heuristics system, the (devirtualized if needed) callee can be inspected
>> first to see if an argument of unloaded type is never used or not. If it
>> is never used, don't even bother inserting the null guard on the
>> caller-side, and just go ahead and inline would be good and safe. C2
>> doesn't have this luxury yet so tackling the problem with a caller-side
>> solution is easier to do.)
>>
>> IMO a nmethod dependency on an "unloaded class" isn't that feasible,
>> since you might not even have a concrete entity to "depend on", and
>> registering symbolic dependencies for "unloaded classes" in general,
>> even though I believe is doable, might be rather tedious.
>>
>
> Agree, it requires a new flavor of nmethod dependency.
>
> Best regards,
> Vladimir Ivanov
>

From john.r.rose at oracle.com  Wed Sep 28 17:37:58 2016
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 28 Sep 2016 10:37:58 -0700
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
	<CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
	<a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>
	<CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>
Message-ID: <AA205B21-7FE5-4E87-BA05-51BA96092C56@oracle.com>

On Sep 28, 2016, at 9:42 AM, Krystal Mok <rednaxelafx at gmail.com> wrote:
> 
> I do also have another patch for the general case for "unused unloaded arguments". I haven't gotten around to polish and test that patch yet, but since we're seeing a good motivation on the OpenJDK side as well, I may as well go back and get that patch ready soon.
> 
> A null guard is a good way to go. It's basically the same kind of logic that C2 OSR entry already uses. In this case, at a call site, a null guard on the caller-side against an argument whose type is unloaded is one way to do it.

This is the fix I would prefer for the inliner.

> (There are of course other alternatives. e.g. If we focus on the callee-side, in a compiler with a mixed top-down / bottom-up inlining heuristics system, the (devirtualized if needed) callee can be inspected first to see if an argument of unloaded type is never used or not. If it is never used, don't even bother inserting the null guard on the caller-side, and just go ahead and inline would be good and safe. C2 doesn't have this luxury yet so tackling the problem with a caller-side solution is easier to do.)

I'd like to do more in this direction.  The EA function summarizer could be overloaded to also gather data on the usage of arguments (as well as their escape status).  For example, if an argument is used to gate a branch (somehow), then having that argument be constant should "add points" to the heuristic that decides inlining.  *In general*, constant arguments should be an "argument" to raise the likelihood of inlining a call.

I'm going to guess that this work would be better done in Graal, but we don't have that luxury yet.

-- John

From vitalyd at gmail.com  Wed Sep 28 17:46:03 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Wed, 28 Sep 2016 13:46:03 -0400
Subject: Odd interaction between ArrayList$Itr and Escape Analysis
In-Reply-To: <AA205B21-7FE5-4E87-BA05-51BA96092C56@oracle.com>
References: <CAHjP37HcvD9JkhwrAjkNpV+vawyc1A=QgYD51SMLrLnJ90J4rw@mail.gmail.com>
	<CA+cQ+tQEeRDKsn0VffLfhXcbUqYd_vehtK9sEussoOkmTux4cA@mail.gmail.com>
	<CAHjP37HKRe9xhhkaWG--t8s=m7ZDvDu57Gt0qCEzi8_Q9K7dTA@mail.gmail.com>
	<CA+cQ+tT4Rx4B_Wr=D0YvmNG4YozsiJRC-tgasv_yTv-M==kYnA@mail.gmail.com>
	<CAHjP37GJQ02mMOT-TtFsoOhRJcjdd08CKp3WxpjYBnYri=7bdw@mail.gmail.com>
	<1619527975.952230.1473776309365.JavaMail.zimbra@u-pem.fr>
	<CAHjP37GGLcLOiWjPPsdMS85PvDcncFg9k9D5dDkCLnnq7ij=FA@mail.gmail.com>
	<CA+cQ+tRJDXRBC2M0_LfCf=XwWs7HrxJKjhsQpSVNW9hjyCY58A@mail.gmail.com>
	<4c873846-5322-ebdf-5e0a-393089aea590@oracle.com>
	<CA+cQ+tQ_8cTOVGu34M5OeSYJhXt9JOD-OzgTqaE9RO1zSTMXAw@mail.gmail.com>
	<a869030d-c6c0-5abd-c746-b0ac4f130053@oracle.com>
	<CA+cQ+tRkEVqT5jXmOYdMCDpPcCWUUB5e_ZLqX6CAVDmwFZGv4Q@mail.gmail.com>
	<AA205B21-7FE5-4E87-BA05-51BA96092C56@oracle.com>
Message-ID: <CAHjP37HXi3qEijDtuptfOxQfZMz-zpndn-Dv2ogKY0LsGTYXhA@mail.gmail.com>

On Wed, Sep 28, 2016 at 1:37 PM, John Rose <john.r.rose at oracle.com> wrote:

> On Sep 28, 2016, at 9:42 AM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>
>
> I do also have another patch for the general case for "unused unloaded
> arguments". I haven't gotten around to polish and test that patch yet, but
> since we're seeing a good motivation on the OpenJDK side as well, I may as
> well go back and get that patch ready soon.
>
> A null guard is a good way to go. It's basically the same kind of logic
> that C2 OSR entry already uses. In this case, at a call site, a null guard
> on the caller-side against an argument whose type is unloaded is one way to
> do it.
>
>
> This is the fix I would prefer for the inliner.
>
> (There are of course other alternatives. e.g. If we focus on the
> callee-side, in a compiler with a mixed top-down / bottom-up inlining
> heuristics system, the (devirtualized if needed) callee can be inspected
> first to see if an argument of unloaded type is never used or not. If it is
> never used, don't even bother inserting the null guard on the caller-side,
> and just go ahead and inline would be good and safe. C2 doesn't have this
> luxury yet so tackling the problem with a caller-side solution is easier to
> do.)
>
>
> I'd like to do more in this direction.  The EA function summarizer could
> be overloaded to also gather data on the usage of arguments (as well as
> their escape status).  For example, if an argument is used to gate a branch
> (somehow), then having that argument be constant should "add points" to the
> heuristic that decides inlining.  *In general*, constant arguments should
> be an "argument" to raise the likelihood of inlining a call.
>
Yes! I'm a bit surprised we've gone so long without constants adding bonus
points for inlining.

I'm definitely seeing places where a callsite isn't inlined for one reason
or another, but there's a constant (sometimes several) being passed through
which would end up folding a bunch of code in the callee, and sometimes
eliminating code there altogether.
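
A minimal sketch of the effect described above, written in C++ purely for illustration (the thread concerns Java methods compiled by C2, and `scale`, `caller` and `caller_folded` are hypothetical names, not code from any patch). It shows a callee whose branch is gated by an argument, and what the call site reduces to once the callee is inlined with a constant:

```cpp
// Hypothetical callee: the `checked` argument gates a branch.
inline int scale(int x, bool checked) {
    if (checked) {
        // Validation path, only needed when the caller asks for it.
        if (x < 0) return 0;
        if (x > 1000) return 2000;
    }
    return x * 2;
}

// Call site passing a constant: once `scale` is inlined here, the whole
// `if (checked)` block is dead code and can be removed.
inline int caller(int x) { return scale(x, false); }

// What the inlined-and-folded body of `caller` reduces to.
inline int caller_folded(int x) { return x * 2; }
```

An inlining heuristic that only looks at callee size would treat `scale` the same for every call site; one that gives credit for constant arguments could notice that most of the callee disappears for `caller`.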

>
> I'm going to guess that this work would be better done in Graal, but we
> don't have that luxury yet.
>
> -- John
>

From vladimir.kozlov at oracle.com  Wed Sep 28 18:00:53 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Sep 2016 11:00:53 -0700
Subject: RFR(XS): 8166806: Add intrinsic support for writer used in event
	based tracing
In-Reply-To: <da213d10-a154-4d84-99e8-2cf030ec51fa@default>
References: <da213d10-a154-4d84-99e8-2cf030ec51fa@default>
Message-ID: <08a3d3a5-5d59-b18d-4aa6-82a2662408d3@oracle.com>

Hi Markus,

Where is _getBufferWriter defined? I don't see the closed changes.

c1_LIRGenerator.cpp: should you use oopConst(NULL) in the compare?

library_call.cpp: TypeInstPtr::MIRROR is useless since the result phi
type is TypePtr::BOTTOM. Using TypePtr::BOTTOM for the load could be less
bug-prone.

Thanks,
Vladimir

On 9/27/16 1:58 PM, Markus Gronlund wrote:
> Greetings,
>
>
>
> Kindly asking for reviews for the following change:
>
>
>
> Bug: http://bugs.openjdk.java.net/browse/JDK-8166806
>
> Webrev: http://cr.openjdk.java.net/~mgronlun/8166806/webrev/
>
>
>
> Thanks in advance
>
> Markus
>

From vladimir.kozlov at oracle.com  Wed Sep 28 18:38:39 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Sep 2016 11:38:39 -0700
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
In-Reply-To: <182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
	<182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>
Message-ID: <d435e4a0-b6c4-81b4-f321-ff28f28da335@oracle.com>

I thought we already do that. There are several places in loopnode.cpp
where we hit an assert if the graph is bad.

Thanks,
Vladimir

On 9/28/16 3:04 AM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~roland/8166836/webrev.00/
>
> Looks good.
>
>> It's quite unfortunate that this wasn't found by testing because
>> compilations where the graph is non schedulable simply fail. This could
>> have gone unnoticed much longer. In debug builds shouldn't we abort the
>> VM in C2Compiler::compile_method() if the compilation fails because of a
>> non schedulable graph?
>
> Sounds reasonable. I expect there are other cases when compilers bail
> out unexpectedly. It would be good to have an assert checking it doesn't
> happen.
>
> Best regards,
> Vladimir Ivanov

From vladimir.kozlov at oracle.com  Wed Sep 28 18:44:59 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 28 Sep 2016 11:44:59 -0700
Subject: RFR: 8166781: fix wrong comment in ReceiverTypeData
In-Reply-To: <6e26665d-b687-4ef8-619f-b398eca63b2a@oracle.com>
References: <6e26665d-b687-4ef8-619f-b398eca63b2a@oracle.com>
Message-ID: <6e307bee-3205-b10c-5120-a68ef56c6f5e@oracle.com>

Yes, the comment change is correct.

Thank you for pointing out our previous discussion.

Vladimir

On 9/27/16 5:54 AM, Roland Schatz wrote:
> Hi,
>
> Please review this comment fix:
>
> webrev: http://cr.openjdk.java.net/~rschatz/JDK-8166781/webrev.00/
> issue: https://bugs.openjdk.java.net/browse/JDK-8166781
>
> According to my reading of the code, the comment should now agree with
> the code.
> But I don't pretend to really understand that code. It would be nice if
> someone who knows about the profiling code could confirm that's actually
> true ;)
>
> See also previous thread about that issue:
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-August/024105.html
>
>
> Thanks,
> Roland
>

From rwestrel at redhat.com  Thu Sep 29 13:29:43 2016
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 29 Sep 2016 15:29:43 +0200
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
In-Reply-To: <182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
	<182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>
Message-ID: <dk61t0296m0.fsf@rwestrel.remote.csb>


>> http://cr.openjdk.java.net/~roland/8166836/webrev.00/
>
> Looks good.

Thanks for the review, Vladimir.

Roland.

From rwestrel at redhat.com  Thu Sep 29 13:31:51 2016
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 29 Sep 2016 15:31:51 +0200
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may
	make	compilation fail silently
In-Reply-To: <d435e4a0-b6c4-81b4-f321-ff28f28da335@oracle.com>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
	<182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>
	<d435e4a0-b6c4-81b4-f321-ff28f28da335@oracle.com>
Message-ID: <dk6y42a7ry0.fsf@rwestrel.remote.csb>


Thanks for taking a look at this.

> I thought we do that. There are several places in loopnode.cpp where we 
> hit assert if graph is bad.

In my case, the graph becomes unschedulable because of anti dependencies
so only after loop opts. Should I open another bug to check
C.failure_reason() in C2Compiler::compile_method() and abort if the
schedule failed?
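
A minimal sketch of the kind of check proposed above, under stated assumptions: `CompileState`, its `schedule_failed` flag, `should_abort` and `check_compile_result` are hypothetical stand-ins, not HotSpot's Compile class or its real failure strings. It only illustrates the idea of turning a silent bailout into a hard failure in debug builds:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for per-compilation state; not HotSpot's Compile class.
struct CompileState {
    const char* failure_reason = nullptr;  // nullptr: the compilation succeeded
    bool schedule_failed = false;          // assumed flag set when scheduling bails out
};

// Decide whether a finished compilation should be treated as fatal.
// Product builds keep the current behavior and never abort.
inline bool should_abort(const CompileState& c, bool debug_build) {
    return debug_build && c.failure_reason != nullptr && c.schedule_failed;
}

// Debug-only check that could run at the end of a compile_method()-like function.
inline void check_compile_result(const CompileState& c, bool debug_build) {
    if (should_abort(c, debug_build)) {
        std::fprintf(stderr, "compilation failed: %s\n", c.failure_reason);
        std::abort();
    }
}
```

Restricting the abort to scheduling failures keeps legitimate bailouts (for example, resource limits) from killing a debug VM.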

Roland.

From alexander.vorobyev at oracle.com  Thu Sep 29 16:30:24 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Thu, 29 Sep 2016 19:30:24 +0300
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java Expected
	message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <542E8041.1010101@oracle.com>
References: <542E8041.1010101@oracle.com>
Message-ID: <3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>


Hi All,

I'd like a review for JDK-8145728
(https://bugs.openjdk.java.net/browse/JDK-8145728)

Judging by the test results, the test fails with specific compiler options:
-XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case
C2 is not used and we are not able to see intrinsics usage in the test
log. So such a configuration is not valid for this test and should not be
used. The proposed fix is to prevent this test from accepting such options.

"@requires" tag was added:
@requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4


Here is webrev:
http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/


Thanks,
Alexander


From vladimir.kozlov at oracle.com  Thu Sep 29 16:38:19 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Sep 2016 09:38:19 -0700
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java Expected
	message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
References: <542E8041.1010101@oracle.com>
	<3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
Message-ID: <4949ae4e-d2f5-f09b-7c8c-ea99cc61351e@oracle.com>

Looks good. Did you run all compiler/cpuflags tests to verify that we 
don't need to fix other tests too?

Thanks,
Vladimir

On 9/29/16 9:30 AM, Alexander Vorobyev wrote:
>
> Hi All,
>
> I'd like review for JDK-8145728
> (https://bugs.openjdk.java.net/browse/JDK-8145728)
>
> Judging by the test results, test fails with specific compiler options:
> -XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case
> C2 is not used and we are not able to see intrinsics usage in the test
> log. So such configuration is not valid for this test and should not be
> used. Supposed fix is to prevent this test from accepting such options.
>
> "@requires" tag was added:
> @requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
>
>
> Here is webrev:
> http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/
>
>
> Thanks,
> Alexander
>
>
>

From vladimir.kozlov at oracle.com  Thu Sep 29 16:44:51 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Sep 2016 09:44:51 -0700
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
In-Reply-To: <dk6y42a7ry0.fsf@rwestrel.remote.csb>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
	<182f42a0-caca-ea5b-0e7d-3ebaff5f1bc1@oracle.com>
	<d435e4a0-b6c4-81b4-f321-ff28f28da335@oracle.com>
	<dk6y42a7ry0.fsf@rwestrel.remote.csb>
Message-ID: <0b65059d-d5c1-9451-2cc9-4f8d7c00ae9d@oracle.com>

Yea, by all means.

Thanks,
Vladimir

On 9/29/16 6:31 AM, Roland Westrelin wrote:
>
> Thanks for taking a look at this.
>
>> I thought we do that. There are several places in loopnode.cpp where we
>> hit assert if graph is bad.
>
> In my case, the graph becomes unschedulable because of anti dependencies
> so only after loop opts. Should I open another bug to check
> C.failure_reason() in C2Compiler::compile_method() and abort if the
> schedule failed?
>
> Roland.
>

From tom.rodriguez at oracle.com  Thu Sep 29 18:25:15 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Thu, 29 Sep 2016 11:25:15 -0700
Subject: RFR(S) 8166869: [JVMCI] record metadata relocations for metadata
	references
Message-ID: <73408CA4-EA92-41B4-9499-AE89F4F2F27B@oracle.com>

http://cr.openjdk.java.net/~never/8166869/webrev
https://bugs.openjdk.java.net/browse/JDK-8166869

JVMCI records metadata references in the metadata section, so scanning of referenced metadata will work properly, but it never actually creates a relocation in the code or constants section. This means the disassembly is a little less readable than it might be.  This adds the creation of the appropriate relocation.  Tested by inspection of assembly printing on sparc and x86.

tom

From vladimir.kozlov at oracle.com  Thu Sep 29 18:30:08 2016
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 29 Sep 2016 11:30:08 -0700
Subject: RFR(S) 8166869: [JVMCI] record metadata relocations for metadata
	references
In-Reply-To: <73408CA4-EA92-41B4-9499-AE89F4F2F27B@oracle.com>
References: <73408CA4-EA92-41B4-9499-AE89F4F2F27B@oracle.com>
Message-ID: <650ac4af-ea41-27b7-d52d-49be3e282c31@oracle.com>

Looks good.

Thanks,
Vladimir

On 9/29/16 11:25 AM, Tom Rodriguez wrote:
> http://cr.openjdk.java.net/~never/8166869/webrev
> https://bugs.openjdk.java.net/browse/JDK-8166869
>
> JVMCI records metadata references in the metadata section, so scanning of referenced metadata will work properly but it never actually creates a relocation in code or constants section. This means the disassembly is a little less readable than it might be.  This adds the creation of the appropriate relocation.  Tested by inspection of assembly printing on sparc and x86.
>
> tom
>

From tom.rodriguez at oracle.com  Fri Sep 30 00:24:55 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Thu, 29 Sep 2016 17:24:55 -0700
Subject: RFR 8166929: [JVMCI] Expose decompile counts in MDO
Message-ID: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>

http://cr.openjdk.java.net/~never/8166929/webrev
https://bugs.openjdk.java.net/browse/JDK-8166929

This is a minor API addition to expose some of the top-level MDO decompile and recompile counts.  It's necessary to detect recompilation pathologies.  Tested by printing MDOs from JVMCI.  I also fixed a few problems I discovered with the formatting of the MDO printed form.

tom

From vitalyd at gmail.com  Fri Sep 30 02:16:52 2016
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Thu, 29 Sep 2016 22:16:52 -0400
Subject: RFR 8166929: [JVMCI] Expose decompile counts in MDO
In-Reply-To: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>
References: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>
Message-ID: <CAHjP37HAuG9vkRN5JA0mgTUOXOOJxQLh-5qdnrNiBp-QAh+3Lg@mail.gmail.com>

Quick fly-by comment: HotSpotMethodData::toString should use %d for
overflow recompiles count printing, like the other counters.

Thanks

On Thursday, September 29, 2016, Tom Rodriguez <tom.rodriguez at oracle.com>
wrote:

> http://cr.openjdk.java.net/~never/8166929/webrev
> https://bugs.openjdk.java.net/browse/JDK-8166929
>
> This is a minor API addition to expose some of the top-level MDO decompile
> and recompile counts.  It's necessary to detect recompilation pathologies.
> Tested by printing MDOs from JVMCI.  I also fixed a few problems I
> discovered with the formatting of the MDO printed form.
>
> tom


-- 
Sent from my phone

From rwestrel at redhat.com  Fri Sep 30 07:52:44 2016
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 30 Sep 2016 09:52:44 +0200
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
In-Reply-To: <dk6eg448j8k.fsf@rwestrel.remote.csb>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
Message-ID: <dk6vaxd7rjn.fsf@rwestrel.remote.csb>


> http://cr.openjdk.java.net/~roland/8166836/webrev.00/

I need a sponsor for this.

Roland.

From zoltan.majo at oracle.com  Fri Sep 30 08:16:33 2016
From: zoltan.majo at oracle.com (=?UTF-8?B?Wm9sdMOhbiBNYWrDsw==?=)
Date: Fri, 30 Sep 2016 10:16:33 +0200
Subject: RFR(S): 8166836: Elimination of clone's ArrayCopyNode may make
	compilation fail silently
In-Reply-To: <dk6vaxd7rjn.fsf@rwestrel.remote.csb>
References: <dk6eg448j8k.fsf@rwestrel.remote.csb>
	<dk6vaxd7rjn.fsf@rwestrel.remote.csb>
Message-ID: <22cacbd4-052f-72ef-f671-7b905c3e47dd@oracle.com>

Hi Roland,


On 09/30/2016 09:52 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/8166836/webrev.00/
> I need a sponsor for this.

I'll take care of it.

Best regards,


Zoltan

>
> Roland.


From HORII at jp.ibm.com  Fri Sep 30 10:17:05 2016
From: HORII at jp.ibm.com (Hiroshi H Horii)
Date: Fri, 30 Sep 2016 10:17:05 +0000
Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for
	ppc64
In-Reply-To: <f52703e8-67b9-0852-540e-a31e5dca1c1e@oracle.com>
References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com>
	<571A1FA3.9030006@oracle.com>
	<201604250709.u3P79jwN024101@d19av07.sagamino.japan.ibm.com>
	<1574d9e7-c9cd-b1e8-e9a1-d63630713724@oracle.com>
	<201605061011.u46ABZDR015108@d19av07.sagamino.japan.ibm.com>
	<848a70ad-00b3-b742-fa4e-87dc0124e0e3@oracle.com>
	<347b1733-fbbc-b65b-5417-7be52a0b5d68@oracle.com>
	<0e47ed4857d94f9bbd99b0738bf1708a@DEWDFE13DE14.global.corp.sap>
	<f5826c30-0e12-8af9-9f78-3e7fd173b899@oracle.com>
	<OFE8C20C07.4A5437DD-ON4925803D.0040476D-4925803D.0041F53D@notes.na.collabserv.com>
	<CAP_pwnWpE9OhRA-XxTjKAq4T2rLjnLXLDomkBvAPdJ1G8XEjQw@mail.gmail.com>
	<f52703e8-67b9-0852-540e-a31e5dca1c1e@oracle.com>
Message-ID: <OFA2287681.8B1427FA-ON4925803E.0035621E-4925803E.00387EBB@notes.na.collabserv.com>

Dear David, and Dan,

Thank you for your comments.

> In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:
> 266 the log line reads data from the forwardee even when the CAS 
> fails. I believe those reads will be unsafe without barriers after 
> the copy of the content of the object.
> hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288 
> same problem as in line 266

Can we use o->size() or new_obj_size instead of new_obj->size()?

> If you feel that the use of new_obj->size() is potentially unsafe then 
> the fact we return new_obj means that any use of new_obj by the caller 
> may also potentially be unsafe.

In my understanding, while copying objects to a survivor space, if a
thread creates a new_obj and installs a pointer to it with CAS, the other
threads can touch new_obj after that thread calls push_contents(new_obj)
(Line: 239). In push_contents, OrderAccess::release_store is called before
the object is pushed as a task into the work-stealing deque
(taskqueue.inline.hpp). If another thread reads the task, the entire copy
of new_obj is safe to read.
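
The argument above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: `Obj`, `CasOrder`, `forward_to`, `publish` and `take` are hypothetical names, std::atomic stands in for HotSpot's Atomic/OrderAccess primitives, and `std::memory_order_seq_cst` is used as an approximation of the "conservative" ordering (the thread itself leaves open whether the two are equivalent):

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for an object with a forwarding pointer; not oopDesc.
struct Obj {
    std::atomic<Obj*> forwardee{nullptr};
    std::size_t size = 0;
};

// Hypothetical mirror of the two-valued ordering choice discussed in the thread.
enum class CasOrder { relaxed, conservative };

// Try to install `copy` as the forwardee of `old_obj`.
// Returns the winning forwardee: `copy` if this thread won the race,
// otherwise the copy installed earlier by another thread.
inline Obj* forward_to(Obj& old_obj, Obj* copy,
                       CasOrder order = CasOrder::conservative) {
    const std::memory_order mo = (order == CasOrder::relaxed)
                                     ? std::memory_order_relaxed
                                     : std::memory_order_seq_cst;
    Obj* expected = nullptr;
    if (old_obj.forwardee.compare_exchange_strong(expected, copy, mo, mo)) {
        return copy;
    }
    return expected;  // on failure, `expected` holds the current forwardee
}

// Publishing a task with release semantics makes all earlier writes to the
// copied object visible to a thread that reads the task with acquire.
inline void publish(std::atomic<Obj*>& slot, Obj* obj) {
    slot.store(obj, std::memory_order_release);
}

inline Obj* take(std::atomic<Obj*>& slot) {
    return slot.load(std::memory_order_acquire);
}
```

Note what the sketch does not guarantee: a thread that loses a relaxed CAS learns the forwardee's address directly from the CAS, not through `take`, so nothing orders its reads of the forwardee's contents after the winner's copy. That is the case the review comments about reading `new_obj->size()` are concerned with.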

Thank you for your help again. I may be misunderstanding or missing
something critical. Any comments and corrections are always appreciated.

Regards,
Hiroshi
-----------------------
Hiroshi Horii, Ph.D.
IBM Research - Tokyo


David Holmes <david.holmes at oracle.com> wrote on 09/30/2016 07:16:16:

> From: David Holmes <david.holmes at oracle.com>
> To: Carsten Varming <varming at gmail.com>, Hiroshi H Horii/Japan/IBM at IBMJP
> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>,
> "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>,
> "hotspot-runtime-dev at openjdk.java.net" <hotspot-runtime-dev at openjdk.java.net>,
> "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>,
> hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net>
> Date: 09/30/2016 07:17
> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and 
> copy_to_survivor for ppc64
> 
> On 30/09/2016 12:47 AM, Carsten Varming wrote:
> > Dear Hiroshi,
> >
> > In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:266
> > the log line reads data from the forwardee even when the CAS fails. I
> > believe those reads will be unsafe without barriers after the copy of
> > the content of the object.
> 
> I find it extremely hard to reason about a barrier-less cmpxchg in
> general.
> 
> If you feel that the use of new_obj->size() is potentially unsafe then 
> the fact we return new_obj means that any use of new_obj by the caller 
> may also potentially be unsafe.
> 
> David
> -----
> 
> > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288
> > same problem as in line 266
> >
> > I would argue that the logging should only happen if the thread
> > successfully copied the object and CAS failures should be logged
> > separately without reading data from the forwardee.
> >
> > BTW, unrelated to your change: It seems like the logging in line 266
> > should be guarded by something like "if (log_develop_is_enabled(Trace,
> > gc, scavenge))" like the logging in line 288.
> >
> > Carsten
> >
> > On Thu, Sep 29, 2016 at 8:00 AM, Hiroshi H Horii <HORII at jp.ibm.com> wrote:
> >
> >     Hi all,
> >
> >     Can I please request reviews for a change for 8154736 that improves
> >     copy_to_survivor performance on ppc64 and aarch64?
> >     If possible, I would like to include this change into jdk9.
> >
> >     8154736 includes two changes, cmpxchg and copy_to_survivor, and the
> >     former was resolved as 8155949.
> >     Now, I would like to ask for a review of the remaining
> >     copy_to_survivor change.
> >
> >     webrev:
> >     http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.01/
> >     JIRA: https://bugs.openjdk.java.net/browse/JDK-8154736
> >
> >     I tested this change with SPECjbb2013. Also, I re-checked that relaxed
> >     cmpxchg is available for changing forwarding pointers. However, because
> >     this change is sensitive, we need more reviews not only from
> >     compiler-dev, but also from gc-dev.
> >
> >     Regards,
> >     Hiroshi
> >     -----------------------
> >     Hiroshi Horii, Ph.D.
> >     IBM Research - Tokyo
> >
> >
> >
> >
> >     From:   David Holmes <david.holmes at oracle.com>
> >     To:     "Doerr, Martin" <martin.doerr at sap.com>,
> >     Hiroshi H Horii/Japan/IBM at IBMJP
> >     Cc:     Tim Ellison <Tim_Ellison at uk.ibm.com>,
> >     "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>,
> >     "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>,
> >     "hotspot-runtime-dev at openjdk.java.net" <hotspot-runtime-dev at openjdk.java.net>
> >     Date:   05/10/2016 19:31
> >     Subject:        Re: RFR(M): 8154736: enhancement of cmpxchg and
> >     copy_to_survivor for ppc64
> >
> >
> >
> >     On 10/05/2016 7:41 PM, Doerr, Martin wrote:
> >     > Hi David,
> >     >
> >     > thank you very much for testing the other platforms.
> >     >
> >     > Here's an updated webrev:
> >     > http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.01/
> >
> >     Thanks. Second test run on its way.
> >
> >     David
> >     -----
> >
> >     > Best regards,
> >     > Martin
> >     >
> >     > -----Original Message-----
> >     > From: hotspot-runtime-dev
> >     > [mailto:hotspot-runtime-dev-bounces at openjdk.java.net] On Behalf Of
> >     > David Holmes
> >     > Sent: Tuesday, 10 May 2016 11:11
> >     > To: Hiroshi H Horii <HORII at jp.ibm.com>
> >     > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>;
> >     > ppc-aix-port-dev at openjdk.java.net;
> >     > hotspot-gc-dev at openjdk.java.net;
> >     > hotspot-runtime-dev at openjdk.java.net
> >     > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
> >     > copy_to_survivor for ppc64
> >     >
> >     > The fix seems incomplete for solaris:
> >     >
> >     > make/Main.gmk:232: recipe for target 'hotspot' failed
> >     >
> >     > "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp",
> >     > line 124: Error: Too many arguments in call to
> >     > "_Atomic_cmpxchg_long(long, volatile long*, long)".
> >     >
> >     > "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp",
> >     > line 128: Error: Too many arguments in call to
> >     > "_Atomic_cmpxchg_long(long, volatile long*, long)".
> >     >
> >     > David
> >     >
> >     > On 10/05/2016 5:34 PM, David Holmes wrote:
> >     >> Hi Hiroshi,
> >     >>
> >     >> On 6/05/2016 8:11 PM, Hiroshi H Horii wrote:
> >     >>> Hi David,
> >     >>>
> >     >>> Thank you for your comments.
> >     >>>
> >     >>> As Martin suggested, I would like to separate this proposal into
> >     >>>   - relaxing the memory order of cmpxchg
> >     >>>   - improvement of copy_to_survivor with relaxed cmpxchg
> >     >>> and discuss the former first.
> >     >>>
> >     >>> Martin kindly created a new webrev that includes the cmpxchg
> >     >>> change.
> >     >>>
> >     >>> http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.00/
> >     >>> He has already tested it with AIX, linuxx86_64, linuxppc64le and
> >     >>> darwinintel64.
> >     >>> (Please tell me if I need to send a new mail for this RFR)
> >     >>
> >     >> Please do as it will be simpler to track that way.
> >     >>
> >     >>>> What I would prefer to see is an additional memory_order value
> >     >>>> (such as memory_order_ignored) which is the default for all
> >     >>>> methods declared to take a memory_order parameter.
> >     >>>
> >     >>> We added a simple enum to specify memory order in atomic.hpp as
> >     >>> follows.
> >     >>>
> >     >>> typedef enum cmpxchg_cmpxchg_memory_order {
> >     >>>   memory_order_relaxed,
> >     >>>   memory_order_conservative
> >     >>> } cmpxchg_memory_order;
> >     >>>
> >     >>> All of the cmpxchg functions have an argument of
> >     >>> cmpxchg_memory_order with a default value
> >     >>> memory_order_conservative that uses the same semantics as the
> >     >>> existing cmpxchg and requires no change for the existing
> >     >>> callers. If you think "memory_order_ignored" is better than
> >     >>> "memory_order_conservative", I will be happy to modify this change.
> >     >>> (I just thought, "ignored" may resemble "relaxed" and may confuse
> >     >>> people who are familiar with C++11's memory semantics.
> >     >>> I would like to know the thoughts of native speakers.)
> >     >>
> >     >> That is fine by me. I don't think "ignored" would be confused with
> >     >> "relaxed", but "conservative" is fine.
> >     >>
> >     >> I will run the patch through our internal build system while you
> >     >> prepare the updated RFR. My only concern is "unused argument"
> >     >> warnings from the compiler. :)
> >     >>
> >     >> We are quickly running into a hard deadline with Feature Complete
> >     >> however - possibly less than 24 hours - for hotspot changes. If this
> >     >> doesn't get in in time I will see if I can shepherd it through the
> >     >> approval process.
> >     >>
> >     >> Thanks,
> >     >> David
> >     >>
> >     >>
> >     >>> Regards,
> >     >>> Hiroshi
> >     >>> -----------------------
> >     >>> Hiroshi Horii, Ph.D.
> >     >>> IBM Research - Tokyo
> >     >>>
> >     >>>
> >     >>> David Holmes <david.holmes at oracle.com> wrote on 05/04/2016 14:55:29:
> >     >>>
> >     >>>> From: David Holmes <david.holmes at oracle.com>
> >     >>>> To: Hiroshi H Horii/Japan/IBM at IBMJP
> >     >>>> Cc: hotspot-gc-dev at openjdk.java.net,
> >     >>>> hotspot-runtime-dev at openjdk.java.net,
> >     >>>> ppc-aix-port-dev at openjdk.java.net,
> >     >>>> Tim Ellison <Tim_Ellison at uk.ibm.com>,
> >     >>>> Volker Simonis <volker.simonis at gmail.com>,
> >     >>>> "Doerr, Martin" <martin.doerr at sap.com>,
> >     >>>> "Lindenmaier, Goetz" <goetz.lindenmaier at sap.com>
> >     >>>> Date: 05/04/2016 14:57
> >     >>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
> >     >>>> copy_to_survivor for ppc64
> >     >>>>
> >     >>>> Hi Hiroshi,
> >     >>>>
> >     >>>> Sorry for the delay on getting back to this.
> >     >>>>
> >     >>>> On 25/04/2016 5:09 PM, Hiroshi H Horii wrote:
> >     >>>>> Hi David,
> >     >>>>>
> >     >>>>> Thank you for your comments and questions.
> >     >>>>>
> >     >>>>>> 1. Are the current cmpxchg semantics exactly the same as
> >     >>>>>> memory_order_seq_cst?
> >     >>>>>
> >     >>>>> This is a very good question.
> >     >>>>>
> >     >>>>> I guess cmpxchg needs a more conservative memory ordering constraint
> >     >>>>> than C++11, which is why a sync is added after the
> >     >>>>> compare-and-exchange operation.
> >     >>>>>
> >     >>>>> Could someone give comments or thoughts?
> >     >>>>
> >     >>>> I don't want to comment on the comparison with C++11. What I would
> >     >>>> prefer to see is an additional memory_order value (such as
> >     >>>> memory_order_ignored) which is the default for all methods declared
> >     >>>> to take a memory_order parameter. That way existing implementations
> >     >>>> are clearly ignoring the memory_order attribute and there is no
> >     >>>> potential for confusion as to whether the existing implementations
> >     >>>> equate to memory_order_seq_cst or not.
> >     >>>>
> >     >>>> That said, I'm not sure it makes sense to add the memory_order
> >     >>>> parameter to all methods with "cas" in their name, e.g.
> >     >>>> oopDesc::cas_set_mark, oopDesc::cas_forward_to, unless those methods
> >     >>>> can sensibly be called with any value for memory_order - which seems
> >     >>>> highly unlikely. Perhaps those methods should identify the weakest
> >     >>>> form of memory_order they support and that should be hard-wired into
> >     >>>> them?
> >     >>>>
> >     >>>> Thanks,
> >     >>>> David
> >     >>>>
> >     >>>>> memory_order_seq_cst is defined as
> >     >>>>>     "Any operation with this memory order is both an acquire
> >     >>>>>      operation and a release operation, plus a single total order
> >     >>>>>      exists in which all threads observe all modifications (see
> >     >>>>>      below) in the same order."
> >     >>>>> (http://en.cppreference.com/w/cpp/atomic/memory_order)
> >     >>>>>
> >     >>>>> In my environment, g++ and xlc generate the following assembly on
> >     >>>>> ppc64le.
> >     >>>>> (Interestingly, they generate the same assembly for any
> >     >>>>> memory_order.)
> >     >>>>>
> >     >>>>> g++ (4.9.2)
> >     >>>>>     100008a4:   ac 04 00 7c     sync
> >     >>>>>     100008a8:   28 50 20 7d     lwarx   r9,0,r10
> >     >>>>>     100008ac:   00 18 09 7c     cmpw    r9,r3
> >     >>>>>     100008b0:   0c 00 c2 40     bne-    100008bc
> >     >>>>>     100008b4:   2d 51 80 7c     stwcx.  r4,0,r10
> >     >>>>>     100008b8:   f0 ff c2 40     bne-    100008a8
> >     >>>>>     100008bc:   2c 01 00 4c     isync
> >     >>>>>
> >     >>>>> xlc (13.1.3)
> >     >>>>>     10000888:   ac 04 00 7c     sync
> >     >>>>>     1000088c:   28 28 c0 7c     lwarx   r6,0,r5
> >     >>>>>     10000890:   40 00 26 7c     cmpld   r6,r0
> >     >>>>>     10000894:   0c 00 82 40     bne     100008a0
> >     >>>>>     10000898:   2d 29 80 7c     stwcx.  r4,0,r5
> >     >>>>>     1000089c:   f0 ff e2 40     bne+    1000088c
> >     >>>>>     100008a0:   2c 01 00 4c     isync
> >     >>>>>
> >     >>>>> On the other hand, the current OpenJDK generates following
> >     assemblies.
> >     >>>>>
> >     >>>>>     508:   ac 04 00 7c     sync
> >     >>>>>     50c:   00 00 5c e9     ld      r10,0(r28)
> >     >>>>>     510:   00 50 3b 7c     cmpd    r27,r10
> >     >>>>>     514:   1c 00 c2 40     bne-    530
> >     >>>>>     518:   a8 40 5c 7d     ldarx   r10,r28,r8
> >     >>>>>     51c:   00 50 3b 7c     cmpd    r27,r10
> >     >>>>>     520:   10 00 c2 40     bne-    530
> >     >>>>>     524:   ad 41 3c 7d     stdcx.  r9,r28,r8
> >     >>>>>     528:   f0 ff c2 40     bne-    518
> >     >>>>>     52c:   ac 04 00 7c     sync
> >     >>>>>     530:   00 50 bb 7f     ...
> >     >>>>>
> >     >>>>> Though we can ignore 50c-514 (because they are a duplicated guard
> >     >>>>> condition), the last sync instruction (52c) makes cmpxchg stricter
> >     >>>>> than memory_order_seq_cst.
> >     >>>>>
> >     >>>>> In some cases, the last sync is necessary: this thread must be able
> >     >>>>> to read all of the changes made by the other threads while it was
> >     >>>>> executing 508 to 530 (the compare-and-exchange sequence).
> >     >>>>>
> >     >>>>>> 2. Has there been a discussion already, establishing that 
the
> >     >>>> modified
> >     >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who 
is
> >     >>>>>> postulating that and based on what evidence?
> >     >>>>>
> >     >>>>> Volker and his colleagues have investigated the current GC code
> >     >>>>> according to this.
> >     >>>>>
> >     >>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019079.html
> >     >>>>> However, I believe we need comments from other GC experts to change
> >     >>>>> the shared code.
> >     >>>>>
> >     >>>>> Regards,
> >     >>>>> Hiroshi
> >     >>>>> -----------------------
> >     >>>>> Hiroshi Horii, Ph.D.
> >     >>>>> IBM Research - Tokyo
> >     >>>>>
> >     >>>>>
> >     >>>>> David Holmes <david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>> wrote on 04/22/2016 21:57:07:
> >     >>>>>
> >     >>>>>> From: David Holmes <david.holmes at oracle.com
> >     <mailto:david.holmes at oracle.com>>
> >     >>>>>> To: Hiroshi H Horii/Japan/IBM at IBMJP, hotspot-runtime-
> >     >>>>>> dev at openjdk.java.net <mailto:dev at openjdk.java.net>,
> >     hotspot-gc-dev at openjdk.java.net <
mailto:hotspot-gc-dev at openjdk.java.net>
> >     >>>>>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com
> >     <mailto:Tim_Ellison at uk.ibm.com>>,
> >     >>>>> ppc-aix-port-dev at openjdk.java.net
> >     <mailto:ppc-aix-port-dev at openjdk.java.net>
> >     >>>>>> Date: 04/22/2016 21:58
> >     >>>>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
> >     >>>>>> copy_to_survivor for ppc64
> >     >>>>>>
> >     >>>>>> Hi Hiroshi,
> >     >>>>>>
> >     >>>>>> Two initial questions:
> >     >>>>>>
> >     >>>>>> 1. Are the current cmpxchg semantics exactly the same as
> >     >>>>>> memory_order_seq_cst?
> >     >>>>>>
> >     >>>>>> 2. Has there been a discussion already, establishing that 
the
> >     >>>> modified
> >     >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who 
is
> >     >>>>>> postulating that and based on what evidence?
> >     >>>>>>
> >     >>>>>> Missing memory barriers have caused very difficult to track down
> >     >>>>>> bugs in the past - very rare race conditions. So any relaxation here
> >     >>>>>> has to be done with extreme confidence.
> >     >>>>>>
> >     >>>>>> Thanks,
> >     >>>>>> David
> >     >>>>>>
> >     >>>>>> On 22/04/2016 10:28 PM, Hiroshi H Horii wrote:
> >     >>>>>>> Dear all:
> >     >>>>>>>
> >     >>>>>>> Can I please request reviews for the following change?
> >     >>>>>>>
> >     >>>>>>> Code change:
> >     >>>>>>>
> >     >>>>>>> http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/
> >     >>>>>>> (I initially created and Martin enhanced so much)
> >     >>>>>>>
> >     >>>>>>> This change follows the discussion started from this mail.
> >     >>>>>>>
> >     http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-
> >     <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016->
> >     >>>>>> April/018960.html
> >     >>>>>>>
> >     >>>>>>> Description:
> >     >>>>>>> This change provides a relaxed compare-and-exchange by introducing
> >     >>>>>>> semantics similar to the C++ atomic memory operators, enum
> >     >>>>>>> memory_order.
> >     >>>>>>> As described in atomic_linux_ppc.inline.hpp, the current
> >     >>>>>>> implementation of cmpxchg is fence_cmpxchg_acquire. This
> >     >>>>>>> implementation is useful for general purposes because the two
> >     >>>>>>> sync calls, before and after cmpxchg, provide strict consistency.
> >     >>>>>>> However, they sometimes cause overhead because sync instructions
> >     >>>>>>> are very expensive in the current POWER chip design.
> >     >>>>>>> In addition, on other platforms such as aarch64, these strict
> >     >>>>>>> semantics may cause some overhead (according to Andrew's mail).
> >     >>>>>>>
> >     >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019073.html
> >     >>>>>>>
> >     >>>>>>> With this change, callers can explicitly specify memory ordering
> >     >>>>>>> constraints for cmpxchg with an additional parameter,
> >     >>>>>>> memory_order order.
> >     >>>>>>>
> >     >>>>>>> typedef enum memory_order {
> >     >>>>>>>    memory_order_relaxed,
> >     >>>>>>>    memory_order_consume,
> >     >>>>>>>    memory_order_acquire,
> >     >>>>>>>    memory_order_release,
> >     >>>>>>>    memory_order_acq_rel,
> >     >>>>>>>    memory_order_seq_cst
> >     >>>>>>> } memory_order;
> >     >>>>>>>
> >     >>>>>>> Because the default value of the parameter is memory_order_seq_cst,
> >     >>>>>>> existing code can use the same semantics of cmpxchg without any
> >     >>>>>>> modification. The relaxed cmpxchg is implemented only on ppc
> >     >>>>>>> in this changeset. Therefore, the behavior on the other platforms
> >     >>>>>>> will not be changed by this changeset.
> >     >>>>>>>
> >     >>>>>>> In addition, with the new parameter of cmpxchg, this change
> >     >>>>>>> improves performance of copy_to_survivor in the parallel GC.
> >     >>>>>>> copy_to_survivor changes forward pointers by using cmpxchg. This
> >     >>>>>>> operation doesn't require any sync instructions. A pointer is
> >     >>>>>>> changed at most once in a GC and when cmpxchg fails, the latest
> >     >>>>>>> pointer is available for the caller. cas_set_mark and
> >     >>>>>>> cas_forward_to are extended with an additional memory_order
> >     >>>>>>> parameter, like cmpxchg, and copy_to_survivor uses
> >     >>>>>>> memory_order_relaxed to modify the forward pointers.
> >     >>>>>>>
> >     >>>>>>> Summary of source code changes:
> >     >>>>>>>
> >     >>>>>>> * src/share/vm/runtime/atomic.hpp
> >     >>>>>>>       - Defines enum memory_order and adds a parameter to
> >     cmpxchg.
> >     >>>>>>>
> >     >>>>>>> * src/share/vm/runtime/atomic.cpp
> >     >>>>>>> * src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
> >     >>>>>>> * src/os_cpu/bsd_zero/vm/atomic_bsd_zero.inline.hpp
> >     >>>>>>> * src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp
> >     >>>>>>> * src/os_cpu/linux_sparc/vm/atomic_linux_sparc.inline.hpp
> >     >>>>>>> * src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp
> >     >>>>>>> * src/os_cpu/linux_zero/vm/atomic_linux_zero.inline.hpp
> >     >>>>>>> * src/os_cpu/solaris_sparc/vm/atomic_solaris_sparc.inline.hpp
> >     >>>>>>> * src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp
> >     >>>>>>> * src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp
> >     >>>>>>>       - Added a parameter for each cmpxchg function to follow
> >     >>>>>>>          the change of atomic.hpp. Their implementations are not
> >     >>>>>>>          changed.
> >     >>>>>>>
> >     >>>>>>> * src/os_cpu/aix_ppc/vm/atomic_aix_ppc.inline.hpp
> >     >>>>>>> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
> >     >>>>>>>       - Added a parameter for each cmpxchg function to follow
> >     >>>>>>>          the change of atomic.hpp. In addition, implementations
> >     >>>>>>>          are changed corresponding to the specified
> >     >>>>>>>          memory_order.
> >     >>>>>>>
> >     >>>>>>> * src/share/vm/oops/oop.hpp
> >     >>>>>>> * src/share/vm/oops/oop.inline.hpp
> >     >>>>>>>       - Add a memory_order parameter to use relaxed 
cmpxchg in
> >     >>>>>>>          cas_set_mark and cas_forward_to.
> >     >>>>>>>
> >     >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.cpp
> >     >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.inline.hpp
> >     >>>>>>>
> >     >>>>>>> Martin tested this changeset on linuxx86_64, linuxppc64le and
> >     >>>>>>> darwinintel64.
> >     >>>>>>> Though more time is needed to test on the other platforms, we
> >     >>>>>>> would like to ask for reviews and start discussion on this
> >     >>>>>>> changeset.
> >     >>>>>>> I also tested this changeset with SPECjbb2013 and confirmed that
> >     >>>>>>> gc pause time is reduced.
> >     >>>>>>>
> >     >>>>>>> Regards,
> >     >>>>>>> Hiroshi
> >     >>>>>>> -----------------------
> >     >>>>>>> Hiroshi Horii, Ph.D.
> >     >>>>>>> IBM Research - Tokyo
> >     >>>>>>>
> >     >>>>>>>



From david.holmes at oracle.com  Fri Sep 30 11:12:27 2016
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 30 Sep 2016 21:12:27 +1000
Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for
	ppc64
In-Reply-To: <OFA2287681.8B1427FA-ON4925803E.0035621E-4925803E.00387EBB@notes.na.collabserv.com>
References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com>
	<571A1FA3.9030006@oracle.com>
	<201604250709.u3P79jwN024101@d19av07.sagamino.japan.ibm.com>
	<1574d9e7-c9cd-b1e8-e9a1-d63630713724@oracle.com>
	<201605061011.u46ABZDR015108@d19av07.sagamino.japan.ibm.com>
	<848a70ad-00b3-b742-fa4e-87dc0124e0e3@oracle.com>
	<347b1733-fbbc-b65b-5417-7be52a0b5d68@oracle.com>
	<0e47ed4857d94f9bbd99b0738bf1708a@DEWDFE13DE14.global.corp.sap>
	<f5826c30-0e12-8af9-9f78-3e7fd173b899@oracle.com>
	<OFE8C20C07.4A5437DD-ON4925803D.0040476D-4925803D.0041F53D@notes.na.collabserv.com>
	<CAP_pwnWpE9OhRA-XxTjKAq4T2rLjnLXLDomkBvAPdJ1G8XEjQw@mail.gmail.com>
	<f52703e8-67b9-0852-540e-a31e5dca1c1e@oracle.com>
	<OFA2287681.8B1427FA-ON4925803E.0035621E-4925803E.00387EBB@notes.na.collabserv.com>
Message-ID: <1e40040e-b494-6e1e-00a4-dc130954cebd@oracle.com>

On 30/09/2016 8:17 PM, Hiroshi H Horii wrote:
> Dear David, and Dan,
>
> Thank you for your comments.
>
>> In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:
>> 266 the log line reads data from the forwardee even when the CAS
>> fails. I believe those reads will be unsafe without barriers after
>> the copy of the content of the object.
>> hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288
>> same problem as in line 266
>
> Can we use o->size() or new_obj_size instead of new_obj->size()?
>
>> If you feel that the use of new_obj->size() is potentially unsafe then
>> the fact we return new_obj means that any use of new_obj by the caller
>> may also potentially be unsafe.
>
> In my understanding, while copying objects to a survivor space, if a
> thread creates a new_obj and sets a pointer with CAS, the other threads
> can touch the new_obj after the thread calls push_contents(new_obj)
> (Line: 239). In push_contents, OrderAccess::release_store is called
> before pushing the object as a task into a work-stealing deque
> (taskqueue.inline.hpp). If another thread reads the task, the whole copy
> of new_obj is safe to read.

I'm not familiar with the larger picture of the GC protocols here, but
just looking at this code fragment in isolation, if the CAS fails we read
o->forwardee() to set new_obj. That in itself is fine because we're
reading the field that we were testing with the CAS. But we could then
dereference new_obj before the thread that won the CAS calls
push_contents; and even if it is after push_contents we have not done an
acquire to pair with the release-store in push_contents.

So I'm really not seeing how we can use a barrier-less CAS here.
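
To make the concern concrete, here is a minimal sketch written with
standard C++11 atomics rather than HotSpot's Atomic/OrderAccess API. It is
not the code in psPromotionManager.inline.hpp: Obj, forwardee,
forward_relaxed and forward_ordered are hypothetical names chosen only for
illustration, and Obj is a stand-in, not oopDesc. The sketch contrasts a
barrier-less CAS with one whose failure path does an acquire that can pair
with the winner's release.

```cpp
#include <atomic>

// Hypothetical stand-in for an object with a forwarding pointer.
// This is NOT HotSpot's oopDesc; it only models the shape of the problem.
struct Obj {
    int size = 0;                          // payload written by the copying thread
    std::atomic<Obj*> forwardee{nullptr};  // stand-in for the forwarding pointer
};

// Barrier-less variant: try to install `copy` as the forwardee of `o`.
// Returns whichever object ended up installed.
inline Obj* forward_relaxed(Obj& o, Obj* copy) {
    Obj* expected = nullptr;
    if (o.forwardee.compare_exchange_strong(expected, copy,
                                            std::memory_order_relaxed,
                                            std::memory_order_relaxed)) {
        return copy;  // we won the race: *copy holds our own writes
    }
    // We lost: `expected` now holds the winner's pointer. The pointer value
    // itself is fine, but with relaxed ordering there is no happens-before
    // edge from the winner's writes to *expected to our later reads of it.
    return expected;
}

// Ordered variant: same logic, but success publishes the copy (release)
// and failure acquires, so a loser may safely read through the pointer.
inline Obj* forward_ordered(Obj& o, Obj* copy) {
    Obj* expected = nullptr;
    if (o.forwardee.compare_exchange_strong(expected, copy,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
        return copy;
    }
    return expected;  // the acquire on failure pairs with the winner's release
}
```

In the relaxed variant a losing thread that goes on to evaluate something
like result->size is in exactly the position described above; the ordered
variant is one way (an assumption on my part, not a proposal from the
webrev) to give that read something to pair with.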

David
-----

>
> Thank you for your help again. I may be misunderstanding or missing
> something critical. Any comments and claims are always appreciated.
>
> Regards,
> Hiroshi
> -----------------------
> Hiroshi Horii, Ph.D.
> IBM Research - Tokyo
>
>
> David Holmes <david.holmes at oracle.com> wrote on 09/30/2016 07:16:16:
>
>> From: David Holmes <david.holmes at oracle.com>
>> To: Carsten Varming <varming at gmail.com>, Hiroshi H Horii/Japan/IBM at IBMJP
>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>, "ppc-aix-port-
>> dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>, "hotspot-
>> runtime-dev at openjdk.java.net" <hotspot-runtime-
>> dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-
>> gc-dev at openjdk.java.net>, hotspot-compiler-dev <hotspot-compiler-
>> dev-bounces at openjdk.java.net>
>> Date: 09/30/2016 07:17
>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> copy_to_survivor for ppc64
>>
>> On 30/09/2016 12:47 AM, Carsten Varming wrote:
>> > Dear Hiroshi,
>> >
>> > In hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:266
>> > the log line reads data from the forwardee even when the CAS fails. I
>> > believe those reads will be unsafe without barriers after the copy of
>> > the content of the object.
>>
>> I find it extremely hard to reason about a barrier-less cmpxchg in
>> general.
>>
>> If you feel that the use of new_obj->size() is potentially unsafe then
>> the fact we return new_obj means that any use of new_obj by the caller
>> may also potentially be unsafe.
>>
>> David
>> -----
>>
>> > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288 same
>> > problem as in line 266
>> >
>> > I would argue that the logging should only happen if the thread
>> > successfully copied the object and CAS failures should be logged
>> > separately without reading data from the forwardee.
>> >
>> > BTW, unrelated to your change: It seems like the logging in line 266
>> > should be guarded by something like "if (log_develop_is_enabled(Trace,
>> > gc, scavenge))" like the logging in line 288.
>> >
>> > Carsten
>> >
>> > On Thu, Sep 29, 2016 at 8:00 AM, Hiroshi H Horii <HORII at jp.ibm.com
>> > <mailto:HORII at jp.ibm.com>> wrote:
>> >
>> >     Hi all,
>> >
>> >     Can I please request reviews for a change for 8154736 that improves
>> >     copy_to_survivor performance on ppc64 and aarch64?
>> >     If possible, I would like to include this change in jdk9.
>> >
>> >     8154736 includes two changes, cmpxchg and copy_to_survivor, and the
>> >     former was resolved as 8155949.
>> >     Now, I would like to ask for a review of the remaining
>> >     copy_to_survivor change.
>> >
>> >     webrev:
>> >     http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.01/
>> >     JIRA: https://bugs.openjdk.java.net/browse/JDK-8154736
>> >     <https://bugs.openjdk.java.net/browse/JDK-8154736>
>> >
>> >     I tested this change with SPECjbb2013. I also re-checked that relaxed
>> >     cmpxchg can be used for changing forwarding pointers. However, because
>> >     this change is sensitive, we need more reviews not only from
>> >     compiler-dev, but also from gc-dev.
>> >
>> >     Regards,
>> >     Hiroshi
>> >     -----------------------
>> >     Hiroshi Horii, Ph.D.
>> >     IBM Research - Tokyo
>> >
>> >
>> >
>> >
>> >     From:   David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>>
>> >     To:     "Doerr, Martin" <martin.doerr at sap.com
>> >     <mailto:martin.doerr at sap.com>>, Hiroshi H
>> >     Horii/Japan/IBM at IBMJP
>> >     Cc:     Tim Ellison <Tim_Ellison at uk.ibm.com
>> >     <mailto:Tim_Ellison at uk.ibm.com>>,
>> >     "ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>"
>> >     <ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>>,
>> >     "hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>"
>> >     <hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>>,
>> >     "hotspot-runtime-dev at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev at openjdk.java.net>"
>> >     <hotspot-runtime-dev at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev at openjdk.java.net>>
>> >     Date:   05/10/2016 19:31
>> >     Subject:        Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     copy_to_survivor for ppc64
>> >
>> >
>> >
>> >     On 10/05/2016 7:41 PM, Doerr, Martin wrote:
>> >     > Hi David,
>> >     >
>> >     > thank you very much for testing the other platforms.
>> >     >
>> >     > Here's an updated webrev:
>> >     > http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.01/
>> >     <http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.01/>
>> >
>> >     Thanks. Second test run on its way.
>> >
>> >     David
>> >     -----
>> >
>> >     > Best regards,
>> >     > Martin
>> >     >
>> >     > -----Original Message-----
>> >     > From: hotspot-runtime-dev [
>> >     mailto:hotspot-runtime-dev-bounces at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev-bounces at openjdk.java.net>] On Behalf Of
>> >     David
>> >     Holmes
>> >     > Sent: Dienstag, 10. Mai 2016 11:11
>> >     > To: Hiroshi H Horii <HORII at jp.ibm.com <mailto:HORII at jp.ibm.com>>
>> >     > Cc: Tim Ellison <Tim_Ellison at uk.ibm.com
>> >     <mailto:Tim_Ellison at uk.ibm.com>>;
>> >     ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>;
>> >     hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>;
>> >     hotspot-runtime-dev at openjdk.java.net
>> >     <mailto:hotspot-runtime-dev at openjdk.java.net>
>> >     > Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     copy_to_survivor for ppc64
>> >     >
>> >     > The fix seems incomplete for solaris:
>> >     >
>> >     > make/Main.gmk:232: recipe for target 'hotspot' failed
>> >     >
>> >     "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/
>> solaris_x86/vm/atomic_solaris_x86.inline.hpp",
>> >     > line 124: Error: Too many arguments in call to
>> >     > "_Atomic_cmpxchg_long(long, volatile long*, long)".
>> >     >
>> >     "/opt/jprt/T/P1/073516.daholme/s/hotspot/src/os_cpu/
>> solaris_x86/vm/atomic_solaris_x86.inline.hpp",
>> >     > line 128: Error: Too many arguments in call to
>> >     > "_Atomic_cmpxchg_long(long, volatile long*, long)".
>> >     >
>> >     > David
>> >     >
>> >     > On 10/05/2016 5:34 PM, David Holmes wrote:
>> >     >> Hi Hiroshi,
>> >     >>
>> >     >> On 6/05/2016 8:11 PM, Hiroshi H Horii wrote:
>> >     >>> Hi David,
>> >     >>>
>> >     >>> Thank you for your comments.
>> >     >>>
>> >     >>> As Martin suggested, I would like to separate this proposal into
>> >     >>>   - relaxing memory order of cmpxchg
>> >     >>>   - improvement of copy_to_survivor with relaxed cmpxchg
>> >     >>> and discuss the former first.
>> >     >>>
>> >     >>> Martin thankfully created a new webrev that includes a change of
>> >     >>> cmpxchg.
>> >     >>>
>> >     http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.00/
>> >     <http://cr.openjdk.java.net/~mdoerr/8155949_relaxed_cas/webrev.00/>
>> >     >>> He has already tested it with AIX, linuxx86_64, linuxppc64le and
>> >     >>> darwinintel64.
>> >     >>> (Please tell me if I need to send a new mail for this RFR)
>> >     >>
>> >     >> Please do as it will be simpler to track that way.
>> >     >>
>> >     >>>> What I would prefer to see is an additional memory_order value
>> >     (such
>> >     as
>> >     >>>> memory_order_ignored) which is the default for all methods
> declared
>> >     to
>> >     >>>> take a memory_order parameter.
>> >     >>>
>> >     >>> We added simple enum to specify memory order in atomic.hpp as
>> >     follows.
>> >     >>>
>> >     >>> typedef enum cmpxchg_memory_order {
>> >     >>>   memory_order_relaxed,
>> >     >>>   memory_order_conservative
>> >     >>> } cmpxchg_memory_order;
>> >     >>>
>> >     >>> All of the cmpxchg functions have an argument of
>> >     >>> cmpxchg_memory_order with a default value
>> >     >>> memory_order_conservative that uses the same semantics as the
>> >     >>> existing cmpxchg and requires no change for the existing
>> >     >>> callers. If you think "memory_order_ignored" is better than
>> >     >>> "memory_order_conservative", I will be happy to modify this change.
>> >     >>> (I just thought, "ignored" may resemble "relaxed" and may make
>> >     >>> people who are familiar with C++11's memory semantics confused.
>> >     >>> I would like to know thoughts of native speakers.)
>> >     >>
>> >     >> That is fine by me. I don't think "ignored" would be confused with
>> >     >> "relaxed", but "conservative" is fine.
>> >     >>
>> >     >> I will run the patch through our internal build system while you
>> >     >> prepare the updated RFR. My only concern is "unused argument"
>> >     >> warnings from the compiler. :)
>> >     >>
>> >     >> We are quickly running into a hard deadline with Feature Complete
>> >     >> however - possibly less than 24 hours - for hotspot changes. If this
>> >     >> doesn't get in in time I will see if I can shepherd it through the
>> >     >> approval process.
>> >     >>
>> >     >> Thanks,
>> >     >> David
>> >     >>
>> >     >>
>> >     >>> Regards,
>> >     >>> Hiroshi
>> >     >>> -----------------------
>> >     >>> Hiroshi Horii, Ph.D.
>> >     >>> IBM Research - Tokyo
>> >     >>>
>> >     >>>
>> >     >>> David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>> wrote on 05/04/2016 14:55:29:
>> >     >>>
>> >     >>>> From: David Holmes <david.holmes at oracle.com
>> >     <mailto:david.holmes at oracle.com>>
>> >     >>>> To: Hiroshi H Horii/Japan/IBM at IBMJP
>> >     >>>> Cc: hotspot-gc-dev at openjdk.java.net
>> >     <mailto:hotspot-gc-dev at openjdk.java.net>, hotspot-runtime-
>> >     >>>> dev at openjdk.java.net <mailto:dev at openjdk.java.net>,
>> >     ppc-aix-port-dev at openjdk.java.net
>> >     <mailto:ppc-aix-port-dev at openjdk.java.net>, Tim Ellison
>> >     >>>> <Tim_Ellison at uk.ibm.com <mailto:Tim_Ellison at uk.ibm.com>>,
>> >     Volker Simonis <volker.simonis at gmail.com
>> >     <mailto:volker.simonis at gmail.com>>,
>> >     >>>> "Doerr, Martin" <martin.doerr at sap.com
>> >     <mailto:martin.doerr at sap.com>>, "Lindenmaier, Goetz"
>> >     >>>> <goetz.lindenmaier at sap.com <mailto:goetz.lindenmaier at sap.com>>
>> >     >>>> Date: 05/04/2016 14:57
>> >     >>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     >>>> copy_to_survivor for ppc64
>> >     >>>>
>> >     >>>> Hi Hiroshi,
>> >     >>>>
>> >     >>>> Sorry for the delay on getting back to this.
>> >     >>>>
>> >     >>>> On 25/04/2016 5:09 PM, Hiroshi H Horii wrote:
>> >     >>>>> Hi David,
>> >     >>>>>
>> >     >>>>> Thank you for your comments and questions.
>> >     >>>>>
>> >     >>>>>> 1. Are the current cmpxchg semantics exactly the same as
>> >     >>>>>> memory_order_seq_cst?
>> >     >>>>>
>> >     >>>>> This is a very good question.
>> >     >>>>>
>> >     >>>>> I guess cmpxchg needs a more conservative memory ordering
>> >     >>>>> constraint than C++11, which is why a sync is added after the
>> >     >>>>> compare-and-exchange operation.
>> >     >>>>>
>> >     >>>>> Could someone give comments or thoughts?
>> >     >>>>
>> >     >>>> I don't want to comment on the comparison with C++11. What I would
>> >     >>>> prefer to see is an additional memory_order value (such as
>> >     >>>> memory_order_ignored) which is the default for all methods declared
>> >     >>>> to take a memory_order parameter. That way existing implementations
>> >     >>>> are clearly ignoring the memory_order attribute and there is no
>> >     >>>> potential for confusion as to whether the existing implementations
>> >     >>>> equate to memory_order_seq_cst or not.
>> >     >>>>
>> >     >>>> That said, I'm not sure it makes sense to add the memory_order
>> >     >>>> parameter to all methods with "cas" in their name, e.g.
>> >     >>>> oopDesc::cas_set_mark, oopDesc::cas_forward_to, unless those methods
>> >     >>>> can sensibly be called with any value for memory_order - which seems
>> >     >>>> highly unlikely. Perhaps those methods should identify the weakest
>> >     >>>> form of memory_order they support and that should be hard-wired into
>> >     >>>> them?
>> >     >>>>
>> >     >>>> Thanks,
>> >     >>>> David
>> >     >>>>
>> >     >>>>> memory_order_seq_cst is defined as
>> >     >>>>>     "Any operation with this memory order is both an acquire
>> >     >>>>>      operation and a release operation, plus a single total order
>> >     >>>>>      exists in which all threads observe all modifications (see
>> >     >>>>>      below) in the same order."
>> >     >>>>> (http://en.cppreference.com/w/cpp/atomic/memory_order)
>> >     >>>>>
>> >     >>>>> In my environment, g++ and xlc generate the following assembly on
>> >     >>>>> ppc64le.
>> >     >>>>> (Interestingly, they generate the same assembly for any
>> >     >>>>> memory_order.)
>> >     >>>>>
>> >     >>>>> g++ (4.9.2)
>> >     >>>>>     100008a4:   ac 04 00 7c     sync
>> >     >>>>>     100008a8:   28 50 20 7d     lwarx   r9,0,r10
>> >     >>>>>     100008ac:   00 18 09 7c     cmpw    r9,r3
>> >     >>>>>     100008b0:   0c 00 c2 40     bne-    100008bc
>> >     >>>>>     100008b4:   2d 51 80 7c     stwcx.  r4,0,r10
>> >     >>>>>     100008b8:   f0 ff c2 40     bne-    100008a8
>> >     >>>>>     100008bc:   2c 01 00 4c     isync
>> >     >>>>>
>> >     >>>>> xlc (13.1.3)
>> >     >>>>>     10000888:   ac 04 00 7c     sync
>> >     >>>>>     1000088c:   28 28 c0 7c     lwarx   r6,0,r5
>> >     >>>>>     10000890:   40 00 26 7c     cmpld   r6,r0
>> >     >>>>>     10000894:   0c 00 82 40     bne     100008a0
>> >     >>>>>     10000898:   2d 29 80 7c     stwcx.  r4,0,r5
>> >     >>>>>     1000089c:   f0 ff e2 40     bne+    1000088c
>> >     >>>>>     100008a0:   2c 01 00 4c     isync
>> >     >>>>>
>> >     >>>>> On the other hand, the current OpenJDK generates following
>> >     assemblies.
>> >     >>>>>
>> >     >>>>>     508:   ac 04 00 7c     sync
>> >     >>>>>     50c:   00 00 5c e9     ld      r10,0(r28)
>> >     >>>>>     510:   00 50 3b 7c     cmpd    r27,r10
>> >     >>>>>     514:   1c 00 c2 40     bne-    530
>> >     >>>>>     518:   a8 40 5c 7d     ldarx   r10,r28,r8
>> >     >>>>>     51c:   00 50 3b 7c     cmpd    r27,r10
>> >     >>>>>     520:   10 00 c2 40     bne-    530
>> >     >>>>>     524:   ad 41 3c 7d     stdcx.  r9,r28,r8
>> >     >>>>>     528:   f0 ff c2 40     bne-    518
>> >     >>>>>     52c:   ac 04 00 7c     sync
>> >     >>>>>     530:   00 50 bb 7f     ...
>> >     >>>>>
>> >     >>>>> Though we can ignore 50c-514 (because they are a duplicated guard
>> >     >>>>> condition),
>> >     >>>>> the last sync instruction (52c) makes cmpxchg stricter than
>> >     >>>>> memory_order_seq_cst.
>> >     >>>>>
>> >     >>>>> In some cases, the last sync is necessary when this thread must be
>> >     >>>>> able to read
>> >     >>>>> all of the changes in the other threads while executing from 508 to
>> >     >>>>> 530 (the range that performs the compare-and-exchange).
>> >     >>>>>
>> >     >>>>>> 2. Has there been a discussion already, establishing that the
>> >     >>>>>> modified
>> >     >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>> >     >>>>>> postulating that and based on what evidence?
>> >     >>>>>
>> >     >>>>> Volker and his colleagues have investigated the current GC code
>> >     >>>>> according to this.
>> >     >>>>>
>> >     >>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019079.html
>> >     >>>>> However, I believe we need comments from other GC experts to change
>> >     >>>>> the shared code.
>> >     >>>>>
>> >     >>>>> Regards,
>> >     >>>>> Hiroshi
>> >     >>>>> -----------------------
>> >     >>>>> Hiroshi Horii, Ph.D.
>> >     >>>>> IBM Research - Tokyo
>> >     >>>>>
>> >     >>>>>
>> >     >>>>> David Holmes <david.holmes at oracle.com> wrote on 04/22/2016 21:57:07:
>> >     >>>>>
>> >     >>>>>> From: David Holmes <david.holmes at oracle.com>
>> >     >>>>>> To: Hiroshi H Horii/Japan/IBM at IBMJP,
>> >     >>>>>> hotspot-runtime-dev at openjdk.java.net,
>> >     >>>>>> hotspot-gc-dev at openjdk.java.net
>> >     >>>>>> Cc: Tim Ellison <Tim_Ellison at uk.ibm.com>,
>> >     >>>>>> ppc-aix-port-dev at openjdk.java.net
>> >     >>>>>> Date: 04/22/2016 21:58
>> >     >>>>>> Subject: Re: RFR(M): 8154736: enhancement of cmpxchg and
>> >     >>>>>> copy_to_survivor for ppc64
>> >     >>>>>>
>> >     >>>>>> Hi Hiroshi,
>> >     >>>>>>
>> >     >>>>>> Two initial questions:
>> >     >>>>>>
>> >     >>>>>> 1. Are the current cmpxchg semantics exactly the same as
>> >     >>>>>> memory_order_seq_cst?
>> >     >>>>>>
>> >     >>>>>> 2. Has there been a discussion already, establishing that the
>> >     >>>>>> modified
>> >     >>>>>> GC code can indeed use memory_order_relaxed? Otherwise who is
>> >     >>>>>> postulating that and based on what evidence?
>> >     >>>>>>
>> >     >>>>>> Missing memory barriers have caused very difficult to track down
>> >     >>>>>> bugs in
>> >     >>>>>> the past - very rare race conditions. So any relaxation here has
>> >     >>>>>> to be
>> >     >>>>>> done with extreme confidence.
>> >     >>>>>>
>> >     >>>>>> Thanks,
>> >     >>>>>> David
>> >     >>>>>>
>> >     >>>>>> On 22/04/2016 10:28 PM, Hiroshi H Horii wrote:
>> >     >>>>>>> Dear all:
>> >     >>>>>>>
>> >     >>>>>>> Can I please request reviews for the following change?
>> >     >>>>>>>
>> >     >>>>>>> Code change:
>> >     >>>>>>>
>> >     >>>>>>> http://cr.openjdk.java.net/~mdoerr/8154736_copy_to_survivor/webrev.00/
>> >     >>>>>>> (I initially created it and Martin enhanced it considerably)
>> >     >>>>>>>
>> >     >>>>>>> This change follows the discussion started from this mail.
>> >     >>>>>>>
>> >     >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/018960.html
>> >     >>>>>>>
>> >     >>>>>>> Description:
>> >     >>>>>>> This change provides a relaxed compare-and-exchange by introducing
>> >     >>>>>>> semantics similar to the C++ atomic memory operators, enum
>> >     >>>>>>> memory_order.
>> >     >>>>>>> As described in atomic_linux_ppc.inline.hpp, the current
>> >     >>>>>>> implementation of cmpxchg is fence_cmpxchg_acquire. This
>> >     >>>>>>> implementation is useful for general purposes because the two sync
>> >     >>>>>>> calls, before and after cmpxchg, provide strict consistency.
>> >     >>>>>>> However, they sometimes cause overhead because sync instructions
>> >     >>>>>>> are very expensive in the current POWER chip design.
>> >     >>>>>>> In addition, on other platforms, such as aarch64, these strict
>> >     >>>>>>> semantics may cause some overhead (according to Andrew's mail).
>> >     >>>>>>>
>> >     >>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2016-April/019073.html
>> >     >>>>>>>
>> >     >>>>>>> With this change, callers can explicitly specify memory ordering
>> >     >>>>>>> constraints for cmpxchg with an additional parameter,
>> >     >>>>>>> memory_order order.
>> >     >>>>>>>
>> >     >>>>>>> typedef enum memory_order {
>> >     >>>>>>>    memory_order_relaxed,
>> >     >>>>>>>    memory_order_consume,
>> >     >>>>>>>    memory_order_acquire,
>> >     >>>>>>>    memory_order_release,
>> >     >>>>>>>    memory_order_acq_rel,
>> >     >>>>>>>    memory_order_seq_cst
>> >     >>>>>>> } memory_order;
>> >     >>>>>>>
>> >     >>>>>>> Because the default value of the parameter is memory_order_seq_cst,
>> >     >>>>>>> existing code can use the same cmpxchg semantics without any
>> >     >>>>>>> modification. The relaxed cmpxchg is implemented only on ppc
>> >     >>>>>>> in this changeset. Therefore, the behavior on the other platforms
>> >     >>>>>>> is not changed by this changeset.
>> >     >>>>>>>
>> >     >>>>>>> In addition, with the new parameter of cmpxchg, this change improves
>> >     >>>>>>> the performance of copy_to_survivor in the parallel GC.
>> >     >>>>>>> copy_to_survivor changes forward pointers by using cmpxchg. This
>> >     >>>>>>> operation doesn't require any sync instructions. A pointer is changed
>> >     >>>>>>> at most once in a GC and, when cmpxchg fails, the latest pointer is
>> >     >>>>>>> available to the caller. cas_set_mark and cas_forward_to are extended
>> >     >>>>>>> with an additional memory_order parameter, like cmpxchg, and
>> >     >>>>>>> copy_to_survivor uses memory_order_relaxed to modify the forward
>> >     >>>>>>> pointers.
>> >     >>>>>>>
>> >     >>>>>>> Summary of source code changes:
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/runtime/atomic.hpp
>> >     >>>>>>>       - Defines enum memory_order and adds a parameter to cmpxchg.
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/runtime/atomic.cpp
>> >     >>>>>>> * src/os_cpu/bsd_x86/vm/atomic_bsd_x86.inline.hpp
>> >     >>>>>>> * src/os_cpu/bsd_zero/vm/atomic_bsd_zero.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_aarch64/vm/atomic_linux_aarch64.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_sparc/vm/atomic_linux_sparc.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_zero/vm/atomic_linux_zero.inline.hpp
>> >     >>>>>>> * src/os_cpu/solaris_sparc/vm/atomic_solaris_sparc.inline.hpp
>> >     >>>>>>> * src/os_cpu/solaris_x86/vm/atomic_solaris_x86.inline.hpp
>> >     >>>>>>> * src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp
>> >     >>>>>>>       - Added a parameter for each cmpxchg function to follow
>> >     >>>>>>>          the change of atomic.hpp. Their implementations are not
>> >     >>>>>>>          changed.
>> >     >>>>>>>
>> >     >>>>>>> * src/os_cpu/aix_ppc/vm/atomic_aix_ppc.inline.hpp
>> >     >>>>>>> * src/os_cpu/linux_ppc/vm/atomic_linux_ppc.inline.hpp
>> >     >>>>>>>       - Added a parameter for each cmpxchg function to follow
>> >     >>>>>>>          the change of atomic.hpp. In addition, implementations
>> >     >>>>>>>          are changed corresponding to the specified memory_order.
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/oops/oop.hpp
>> >     >>>>>>> * src/share/vm/oops/oop.inline.hpp
>> >     >>>>>>>       - Add a memory_order parameter to use relaxed cmpxchg in
>> >     >>>>>>>          cas_set_mark and cas_forward_to.
>> >     >>>>>>>
>> >     >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.cpp
>> >     >>>>>>> * src/share/vm/gc/parallel/psPromotionManager.inline.hpp
>> >     >>>>>>>
>> >     >>>>>>> Martin tested this changeset on linuxx86_64, linuxppc64le and
>> >     >>>>>>> darwinintel64.
>> >     >>>>>>> Though more time is needed to test on the other platforms, we would
>> >     >>>>>>> like to ask for reviews and start discussion on this changeset.
>> >     >>>>>>> I also tested this changeset with SPECjbb2013 and confirmed that GC
>> >     >>>>>>> pause time is reduced.
>> >     >>>>>>>
>> >     >>>>>>> Regards,
>> >     >>>>>>> Hiroshi
>> >     >>>>>>> -----------------------
>> >     >>>>>>> Hiroshi Horii, Ph.D.
>> >     >>>>>>> IBM Research - Tokyo
>> >     >>>>>>>
>> >     >>>>>>>
>> >     >>>>>>
>> >     >>>>>
>> >     >>>>
>> >     >>>
>> >
>> >
>> >
>> >
>> >
>> >
>>
>

From alexander.vorobyev at oracle.com  Fri Sep 30 11:31:25 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Fri, 30 Sep 2016 14:31:25 +0300
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java Expected
	message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <4949ae4e-d2f5-f09b-7c8c-ea99cc61351e@oracle.com>
References: <542E8041.1010101@oracle.com>
	<3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
	<4949ae4e-d2f5-f09b-7c8c-ea99cc61351e@oracle.com>
Message-ID: <fe5145c4-7bcb-efe1-0f31-e1a52ee62028@oracle.com>

Do you mean with -XX:+TieredCompilation -XX:TieredStopAtLevel=1? There
are no reports of such failures - all other compiler/cpuflags tests
pass, judging by the existing test run results. Just in case, I ran the
compiler/cpuflags tests manually - no failures.


On 29.09.2016 19:38, Vladimir Kozlov wrote:
> Looks good. Did you run all compiler/cpuflags tests to verify that we 
> don't need to fix other tests too?
>
> Thanks,
> Vladimir
>
> On 9/29/16 9:30 AM, Alexander Vorobyev wrote:
>>
>> Hi All,
>>
>> I'd like a review for JDK-8145728
>> (https://bugs.openjdk.java.net/browse/JDK-8145728)
>>
>> Judging by the test results, the test fails with specific compiler options:
>> -XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case
>> C2 is not used and we are not able to see intrinsics usage in the test
>> log. So such a configuration is not valid for this test and should not be
>> used. The proposed fix is to prevent this test from accepting such options.
>>
>> An "@requires" tag was added:
>> @requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
>>
>>
>> Here is webrev:
>> http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/
>>
>>
>> Thanks,
>> Alexander
>>
>>
>>


From thomas.schatzl at oracle.com  Fri Sep 30 12:02:31 2016
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 30 Sep 2016 14:02:31 +0200
Subject: RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor
	for ppc64
In-Reply-To: <1e40040e-b494-6e1e-00a4-dc130954cebd@oracle.com>
References: <201604221228.u3MCSXCL020021@d19av07.sagamino.japan.ibm.com>
	<571A1FA3.9030006@oracle.com>
	<201604250709.u3P79jwN024101@d19av07.sagamino.japan.ibm.com>
	<1574d9e7-c9cd-b1e8-e9a1-d63630713724@oracle.com>
	<201605061011.u46ABZDR015108@d19av07.sagamino.japan.ibm.com>
	<848a70ad-00b3-b742-fa4e-87dc0124e0e3@oracle.com>
	<347b1733-fbbc-b65b-5417-7be52a0b5d68@oracle.com>
	<0e47ed4857d94f9bbd99b0738bf1708a@DEWDFE13DE14.global.corp.sap>
	<f5826c30-0e12-8af9-9f78-3e7fd173b899@oracle.com>
	<OFE8C20C07.4A5437DD-ON4925803D.0040476D-4925803D.0041F53D@notes.na.collabserv.com>
	<CAP_pwnWpE9OhRA-XxTjKAq4T2rLjnLXLDomkBvAPdJ1G8XEjQw@mail.gmail.com>
	<f52703e8-67b9-0852-540e-a31e5dca1c1e@oracle.com>
	<OFA2287681.8B1427FA-ON4925803E.0035621E-4925803E.00387EBB@notes.na.collabserv.com>
	<1e40040e-b494-6e1e-00a4-dc130954cebd@oracle.com>
Message-ID: <1475236951.6301.72.camel@oracle.com>

Hi,

On Fri, 2016-09-30 at 21:12 +1000, David Holmes wrote:
> On 30/09/2016 8:17 PM, Hiroshi H Horii wrote:
> > 
> > Dear David, and Dan,
> > 
> > Thank you for your comments.
> > 
> > > 
> > > In
> > > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:266
> > > the log line reads data from the forwardee even when the CAS
> > > fails. I believe those reads will be unsafe without barriers
> > > after the copy of the content of the object.
> > > hotspot/src/share/vm/gc/parallel/psPromotionManager.inline.hpp:288
> > > same problem as in line 266
> > Can we use o->size() or new_obj_size instead of new_obj->size()?

They are not equivalent. Parallel GC and other collectors creatively
reuse the "length" field of objArrays to indicate progress in
scanning them during GC.

new_obj_size is the result of a call to o->size() (and the compiler may
redo computations at any point), so has the same issue.

> > > If you feel that the use of new_obj->size() is potentially unsafe
> > > then
> > > the fact we return new_obj means that any use of new_obj by the
> > > caller
> > > may also potentially be unsafe.
> > In my understanding, while copying objects to a survivor space, if
> > a thread creates a new_obj and sets a pointer with CAS, the other
> > threads can touch the new_obj after the thread calls
> > push_contents(new_obj) (Line: 239). In push_contents,
> > OrderAccess::release_store is called before pushing the object as a
> > task into a deque of workstealing (taskqueue.inline.hpp). If the
> > other thread reads the task, all of copy for new_obj is safe.
> I'm not familiar with the larger picture of the GC protocols here,
> but just looking at this code fragment in isolation if the CAS fails
> we read o->forwardee() to set new_obj. That in itself is fine because
> we're reading the field that we were testing with the CAS. But we
> could then dereference new_obj before the thread that won the CAS calls
> push_contents; and even if it is after push_contents we have not done
> an acquire to pair with the release-store in push_contents.

I think Hiroshi thinks that since the work stealing itself does a CAS
with barrier after obtaining "new_obj" in the other thread, it should
be safe (for other threads consuming an object on the task queue).

> So I'm really not seeing how we can use a barrier-less CAS here.

I also do not think it is safe as is - for example, at least
PSPromotionManager::copy_and_push_safe_barrier() reads data from the
returned new_obj (in another log message :)) regardless of failure.

That method also reads the forwardee if forwarded, and then again uses
object information in that same log message. A quick look did not show
other issues, but don't count this as a review.
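
To make the concern concrete, here is a minimal sketch in standard C++
(std::atomic instead of HotSpot's Atomic::cmpxchg). It is not HotSpot code:
Obj, forward_or_get and the field names are hypothetical and only illustrate
why the losing thread needs an acquire before it reads the winner's copy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct Obj {
  std::atomic<Obj*> forwardee{nullptr};  // forwarding pointer, installed by CAS
  std::size_t size = 0;                  // payload written before publishing
};

// Tries to install `copy` as the forwardee of `o` and returns the winning copy.
// acq_rel on success publishes the writes already made to `copy`; acquire on
// failure makes the winner's writes visible before the loser reads them.
// If both orders were memory_order_relaxed, a later read of `size` through
// the returned pointer on the losing path would be a data race.
inline Obj* forward_or_get(Obj* o, Obj* copy) {
  assert(o != nullptr && copy != nullptr);
  Obj* expected = nullptr;
  if (o->forwardee.compare_exchange_strong(expected, copy,
                                           std::memory_order_acq_rel,
                                           std::memory_order_acquire)) {
    return copy;    // this thread won the race
  }
  return expected;  // another thread won; `expected` now holds its copy
}
```

A caller that logs forward_or_get(o, copy)->size, as the log messages
mentioned above do, relies on exactly this acquire on the failure path.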

Thanks,
  Thomas


From igor.ignatyev at oracle.com  Fri Sep 30 13:05:30 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 30 Sep 2016 16:05:30 +0300
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java
	Expected message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
References: <542E8041.1010101@oracle.com>
	<3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
Message-ID: <C3945102-2381-4CAF-872B-05591A220327@oracle.com>

Alexander,

Your fix literally removes the test from almost all executions, because we do not set -XX:TieredStopAtLevel=4 in any configs. From my point of view, changing the AESSupportPredicate class is a better way to fix this issue, since it will be reused by all other tests.

I also have a question regarding your evaluation. Based on your own comment [1], C2 not being used cannot be the reason why this test failed before; otherwise you would have been able to reproduce this bug without any problems. Could you please provide a more detailed evaluation?

[1] https://bugs.openjdk.java.net/browse/JDK-8145728?focusedCommentId=13996257&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13996257
It is not reproducible with the latest builds of JDK 9 (b133), even on the same host with the same options.

Thanks,
-- Igor

> On Sep 29, 2016, at 7:30 PM, Alexander Vorobyev <alexander.vorobyev at oracle.com> wrote:
> 
> 
> Hi All,
> 
> I'd like a review for JDK-8145728 (https://bugs.openjdk.java.net/browse/JDK-8145728)
> 
> Judging by the test results, the test fails with specific compiler options: -XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case C2 is not used and we are not able to see intrinsics usage in the test log. So such a configuration is not valid for this test and should not be used. The proposed fix is to prevent this test from accepting such options.
> 
> "@requires" tag was added:
> @requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
> 
> 
> Here is webrev:
> http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/
> 
> 
> Thanks,
> Alexander
> 
> 
> 


From alexander.vorobyev at oracle.com  Fri Sep 30 13:51:32 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Fri, 30 Sep 2016 16:51:32 +0300
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java Expected
	message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <C3945102-2381-4CAF-872B-05591A220327@oracle.com>
References: <542E8041.1010101@oracle.com>
	<3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
	<C3945102-2381-4CAF-872B-05591A220327@oracle.com>
Message-ID: <47fcc8c7-c986-fec5-74a5-6f5fcd35e83f@oracle.com>

Regarding my comment: I really was not able to reproduce this issue at
that time, because the earliest failure report has different VM options
and does not contain the -XX:TieredStopAtLevel option.

Do you always use the -XX:TieredStopAtLevel option in test runs? My fix
allows this test to run when the option is not set
(vm.opt.TieredStopAtLevel == null).
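
To spell out how the proposed condition behaves, here is a toy model in
C++. It is not jtreg code: test_is_selected and check_selection are
hypothetical names, and an empty optional is assumed to stand for "flag not
passed" (null in the @requires expression).

```cpp
#include <cassert>
#include <optional>

// Toy model of the proposed condition
//   vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
// An empty optional models the flag being absent from the command line.
inline bool test_is_selected(std::optional<int> tiered_stop_at_level) {
  return !tiered_stop_at_level.has_value() || *tiered_stop_at_level == 4;
}

// Usage: the three cases discussed in this thread.
inline void check_selection() {
  assert(test_is_selected(std::nullopt));  // flag not set: test runs
  assert(test_is_selected(4));             // full tiered compilation: test runs
  assert(!test_is_selected(1));            // C1 only: test is skipped
}
```

Under this reading, configurations that never pass the flag still run the
test; only runs that explicitly stop below level 4 are excluded.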

The AESSupportPredicate class only checks the CPU AES feature, which is
exactly what it is supposed to do. Is it really necessary to add new
functionality (unrelated to the AES feature) to it?

Thanks

On 30.09.2016 16:05, Igor Ignatyev wrote:
> Alexander,
>
> Your fix literally removes the test from almost all executions, because we do not set -XX:TieredStopAtLevel=4 in any configs. From my point of view, changing the AESSupportPredicate class is a better way to fix this issue, since it will be reused by all other tests.
>
> I also have a question regarding your evaluation. Based on your own comment [1], C2 not being used cannot be the reason why this test failed before; otherwise you would have been able to reproduce this bug without any problems. Could you please provide a more detailed evaluation?
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8145728?focusedCommentId=13996257&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13996257
> It is not reproducible with the latest builds of JDK 9 (b133), even on the same host with the same options.
>
> Thanks,
> -- Igor
>
>> On Sep 29, 2016, at 7:30 PM, Alexander Vorobyev <alexander.vorobyev at oracle.com> wrote:
>>
>>
>> Hi All,
>>
>> I'd like a review for JDK-8145728 (https://bugs.openjdk.java.net/browse/JDK-8145728)
>>
>> Judging by the test results, the test fails with specific compiler options: -XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case C2 is not used and we are not able to see intrinsics usage in the test log. So such a configuration is not valid for this test and should not be used. The proposed fix is to prevent this test from accepting such options.
>>
>> "@requires" tag was added:
>> @requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
>>
>>
>> Here is webrev:
>> http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/
>>
>>
>> Thanks,
>> Alexander
>>
>>
>>


From igor.ignatyev at oracle.com  Fri Sep 30 13:58:06 2016
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 30 Sep 2016 16:58:06 +0300
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java
	Expected message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <47fcc8c7-c986-fec5-74a5-6f5fcd35e83f@oracle.com>
References: <542E8041.1010101@oracle.com>
	<3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
	<C3945102-2381-4CAF-872B-05591A220327@oracle.com>
	<47fcc8c7-c986-fec5-74a5-6f5fcd35e83f@oracle.com>
Message-ID: <02A2359A-B806-4D3A-B993-7647EB5BFA48@oracle.com>

Alexander,

please see inline below.

> On Sep 30, 2016, at 4:51 PM, Alexander Vorobyev <alexander.vorobyev at oracle.com> wrote:
> 
> Regarding my comment: I really was not able to reproduce this issue at that time, because the earliest failure report has different VM options and does not contain the -XX:TieredStopAtLevel option.
That means there can be another issue with the test or the product which hasn't been investigated, and your fix can hide it.
> 
> Do you always use the -XX:TieredStopAtLevel option in test runs? My fix allows this test to run when the option is not set (vm.opt.TieredStopAtLevel == null).
No, we don't.
> 
> The AESSupportPredicate class only checks the CPU AES feature, which is exactly what it is supposed to do. Is it really necessary to add new functionality (unrelated to the AES feature) to it?
AESSupportPredicate is supposed to check that the JVM can use AES. AFAIR there is AES intrinsics support only in C2, so a disabled C2 basically means the JVM cannot use AES intrinsics.

Regards,
-- Igor
> 
> Thanks
> 
> On 30.09.2016 16:05, Igor Ignatyev wrote:
>> Alexander,
>> 
>> Your fix literally removes the test from almost all executions, because we do not set -XX:TieredStopAtLevel=4 in any configs. From my point of view, changing the AESSupportPredicate class is a better way to fix this issue, since it will be reused by all other tests.
>> 
>> I also have a question regarding your evaluation. Based on your own comment [1], C2 not being used cannot be the reason why this test failed before; otherwise you would have been able to reproduce this bug without any problems. Could you please provide a more detailed evaluation?
>> 
>> [1] https://bugs.openjdk.java.net/browse/JDK-8145728?focusedCommentId=13996257&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13996257
>> It is not reproducible with the latest builds of JDK 9 (b133), even on the same host with the same options.
>> 
>> Thanks,
>> -- Igor
>> 
>>> On Sep 29, 2016, at 7:30 PM, Alexander Vorobyev <alexander.vorobyev at oracle.com> wrote:
>>> 
>>> 
>>> Hi All,
>>> 
>>> I'd like a review for JDK-8145728 (https://bugs.openjdk.java.net/browse/JDK-8145728)
>>> 
>>> Judging by the test results, the test fails with specific compiler options: -XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case C2 is not used and we are not able to see intrinsics usage in the test log. So such a configuration is not valid for this test and should not be used. The proposed fix is to prevent this test from accepting such options.
>>> 
>>> "@requires" tag was added:
>>> @requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
>>> 
>>> 
>>> Here is webrev:
>>> http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/
>>> 
>>> 
>>> Thanks,
>>> Alexander
>>> 
>>> 
>>> 
> 


From alexander.vorobyev at oracle.com  Fri Sep 30 15:06:33 2016
From: alexander.vorobyev at oracle.com (Alexander Vorobyev)
Date: Fri, 30 Sep 2016 18:06:33 +0300
Subject: Request for review: JDK-8145728:
	compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java Expected
	message not found:
	'com.sun.crypto.provider.AESCrypt::(implEncryptBlock|implDecryptBlock)
	([0-9]+ bytes) (intrinsic) not found on supported platfroms
In-Reply-To: <02A2359A-B806-4D3A-B993-7647EB5BFA48@oracle.com>
References: <542E8041.1010101@oracle.com>
	<3f1a7b1e-1ec1-6af6-5b38-84eae3ba4d40@oracle.com>
	<C3945102-2381-4CAF-872B-05591A220327@oracle.com>
	<47fcc8c7-c986-fec5-74a5-6f5fcd35e83f@oracle.com>
	<02A2359A-B806-4D3A-B993-7647EB5BFA48@oracle.com>
Message-ID: <fa55bc97-8076-e115-5132-63b392d4e9bb@oracle.com>


On 30.09.2016 16:58, Igor Ignatyev wrote:
> Alexander,
>
> please see inline below.
>
>> On Sep 30, 2016, at 4:51 PM, Alexander Vorobyev <alexander.vorobyev at oracle.com> wrote:
>>
>> Regarding my comment: I really was not able to reproduce this issue at that time, because the earliest failure report has different VM options and does not contain the -XX:TieredStopAtLevel option.
> That means there can be another issue with the test or the product which hasn't been investigated, and your fix can hide it.
I don't think my fix can hide it, because it does not involve the VM
options/configurations from the earliest failure reports. The only issue
my fix targets is the invalid VM configuration with the
-XX:TieredStopAtLevel option.
>> Do you always use the -XX:TieredStopAtLevel option in test runs? My fix allows this test to run when the option is not set (vm.opt.TieredStopAtLevel == null).
> No, we don't.
>> The AESSupportPredicate class only checks the CPU AES feature, which is exactly what it is supposed to do. Is it really necessary to add new functionality (unrelated to the AES feature) to it?
> AESSupportPredicate is supposed to check that the JVM can use AES. AFAIR there is AES intrinsics support only in C2, so a disabled C2 basically means the JVM cannot use AES intrinsics.
Maybe I was wrong. For now, AESSupportPredicate uses the CPUInfo class,
which reports CPU features, not JVM features. And, for example,
TestAESIntrinsicsOnUnsupportedConfig.java expects exactly such
behaviour, because "UnsupportedConfig" means a CPU with no AES feature.
On such a CPU we will see the "AES instructions are not available on
this CPU" warning (TestAESIntrinsicsOnUnsupportedConfig expects this
warning) in the test log, but on a CPU with the AES feature and with the
-XX:TieredStopAtLevel=1 option (you propose to make AESSupportPredicate
return FALSE for this configuration, right?) we won't. As a result, we
will have TestAESIntrinsicsOnUnsupportedConfig failures on platforms
where this test is not even supposed to be run. Please correct me if I
misunderstand your idea.
>
> Regards,
> -- Igor
>> Thanks
>>
>> On 30.09.2016 16:05, Igor Ignatyev wrote:
>>> Alexander,
>>>
>>> Your fix literally removes the test from almost all executions, because we do not set -XX:TieredStopAtLevel=4 in any configs. From my point of view, changing the AESSupportPredicate class is a better way to fix this issue, since it will be reused by all other tests.
>>>
>>> I also have a question regarding your evaluation. Based on your own comment [1], C2 not being used cannot be the reason why this test failed before; otherwise you would have been able to reproduce this bug without any problems. Could you please provide a more detailed evaluation?
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8145728?focusedCommentId=13996257&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13996257
>>> It is not reproducible with the latest builds of JDK 9 (b133), even on the same host with the same options.
>>>
>>> Thanks,
>>> -- Igor
>>>
>>>> On Sep 29, 2016, at 7:30 PM, Alexander Vorobyev <alexander.vorobyev at oracle.com> wrote:
>>>>
>>>>
>>>> Hi All,
>>>>
>>>> I'd like a review for JDK-8145728 (https://bugs.openjdk.java.net/browse/JDK-8145728)
>>>>
>>>> Judging by the test results, the test fails with specific compiler options: -XX:+TieredCompilation -XX:TieredStopAtLevel=N, where N<4. In this case C2 is not used and we are not able to see intrinsics usage in the test log. So such a configuration is not valid for this test and should not be used. The proposed fix is to prevent this test from accepting such options.
>>>>
>>>> "@requires" tag was added:
>>>> @requires vm.opt.TieredStopAtLevel == null | vm.opt.TieredStopAtLevel == 4
>>>>
>>>>
>>>> Here is webrev:
>>>> http://cr.openjdk.java.net/~avorobye/8145728/webrew.00/
>>>>
>>>>
>>>> Thanks,
>>>> Alexander
>>>>
>>>>
>>>>


From HORIE at jp.ibm.com  Fri Sep 30 13:30:36 2016
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Fri, 30 Sep 2016 13:30:36 +0000
Subject: RFR:8166684:implement intrinsic code with vector instructions
	for Unsafe.copyMemory()
In-Reply-To: <dfadb7d6aea04ce9812f7da840bdc0ae@DEWDFE13DE50.global.corp.sap>
References: <dfadb7d6aea04ce9812f7da840bdc0ae@DEWDFE13DE50.global.corp.sap>,
	<OFDCE69545.F72C83D9-ON0025803A.00284CD7-4925803A.0029E067@notes.na.collabserv.com>
	<4c013cabdeeb476f97c427643aef7a1b@DEWDFE13DE14.global.corp.sap>
Message-ID: <OFC96718A5.164E004B-ON0025803E.0048DDF2-0025803E.004A36BE@notes.na.collabserv.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160930/6622c8a9/attachment.html>

From martin.doerr at sap.com  Fri Sep 30 16:00:23 2016
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 30 Sep 2016 16:00:23 +0000
Subject: RFR:8166684:implement intrinsic code with vector instructions
	for	Unsafe.copyMemory()
In-Reply-To: <OFC96718A5.164E004B-ON0025803E.0048DDF2-0025803E.004A36BE@notes.na.collabserv.com>
References: <dfadb7d6aea04ce9812f7da840bdc0ae@DEWDFE13DE50.global.corp.sap>,
	<OFDCE69545.F72C83D9-ON0025803A.00284CD7-4925803A.0029E067@notes.na.collabserv.com>
	<4c013cabdeeb476f97c427643aef7a1b@DEWDFE13DE14.global.corp.sap>
	<OFC96718A5.164E004B-ON0025803E.0048DDF2-0025803E.004A36BE@notes.na.collabserv.com>
Message-ID: <690881fb156d4d0a83dc31a01e50b4ec@DEWDFE13DE14.global.corp.sap>

Hi Michihiro,

thanks for contributing this change. Looks good now. We will test it.
We'll push it if it gets approved and reviewed.

Best regards,
Martin

From: Michihiro Horie [mailto:HORIE at jp.ibm.com]
Sent: Friday, 30 September 2016 15:31
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Cc: gromero at linux.vnet.ibm.com; Hiroshi H Horii <HORII at jp.ibm.com>; Doerr, Martin <martin.doerr at sap.com>; ppc-aix-port-dev at openjdk.java.net; Simonis, Volker <volker.simonis at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RE: RFR:8166684:implement intrinsic code with vector instructions for Unsafe.copyMemory()

Hi Goetz, Martin,

Would you review this? The initialization of tmp1 is now outside the loop.

JIRA: https://bugs.openjdk.java.net/browse/JDK-8166684
Webrev: http://cr.openjdk.java.net/~horii/8166684/webrev.01/

We created the webrev ourselves and cc'ed hotspot-compiler-dev.

Best regards,
--
Michihiro,
IBM Research - Tokyo


----- Original message -----
From: "Lindenmaier, Goetz" <goetz.lindenmaier at sap.com>
To: "Doerr, Martin" <martin.doerr at sap.com>, Michihiro Horie/Japan/IBM at IBMJP, "Simonis, Volker" <volker.simonis at sap.com>, "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>
Cc: Hiroshi H Horii/Japan/IBM at IBMJP, Gustavo Romero <gromero at linux.vnet.ibm.com>
Subject: RE: RFR:8166684:implement intrinsic code with vector instructions for Unsafe.copyMemory()
Date: Mon, Sep 26, 2016 10:51 PM

Hi,

please post this RFR also to hotspot-compiler-dev. It must be reviewed
on one of the official lists before it can be pushed. Ppc-aix-port-dev
is only for communication about the port, not for reviews.

Also, I would appreciate it if you could upload your webrevs yourselves.
We are happy to help out in the beginning, and also with testing,
reviewing and pushing, but making webrevs is a task I don't see on
our side in the long term.

Thanks and best regards,
  Goetz.

> -----Original Message-----
> From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Doerr, Martin
> Sent: Monday, 26 September 2016 11:53
> To: Michihiro Horie <HORIE at jp.ibm.com>; Simonis, Volker
> <volker.simonis at sap.com>; ppc-aix-port-dev at openjdk.java.net
> Cc: Hiroshi H Horii <HORII at jp.ibm.com>; Gustavo Romero
> <gromero at linux.vnet.ibm.com>
> Subject: RE: RFR:8166684:implement intrinsic code with vector instructions
> for Unsafe.copyMemory()
>
> Hi Michihiro,
>
> the initialization of tmp1 should be done outside of the loop. Besides that, the
> change looks good:
>
> http://cr.openjdk.java.net/~mdoerr/8166684_PPC64_unsafe_copymemory/webrev.00/
>
> Best regards,
>
> Martin
>
> From: Michihiro Horie [mailto:HORIE at jp.ibm.com]
> Sent: Monday, 26 September 2016 09:37
> To: Doerr, Martin <martin.doerr at sap.com>; Simonis, Volker
> <volker.simonis at sap.com>; ppc-aix-port-dev at openjdk.java.net
> Cc: volker.simonis at gmail.com; Gustavo Romero
> <gromero at linux.vnet.ibm.com>; Hiroshi H Horii <HORII at jp.ibm.com>
> Subject: RFR:8166684:implement intrinsic code with vector instructions for
> Unsafe.copyMemory()
>
> Dear all,
>
> Could I please request reviews for the following change?
> This change was created for JDK9.
>
> I added fixes to the intrinsic code for sun.misc.Unsafe.copyMemory() by
> using VSX.
> Since Spark often invokes Unsafe.copyMemory(), it is beneficial to use
> vector instructions for this intrinsic code.
>
> jira: https://bugs.openjdk.java.net/browse/JDK-8166684
> diff: (See attached file: unsafe-copymemory-openjdk9.diff)
>
> Best regards,
> --
> Michihiro Horie,
> IBM Research - Tokyo



From tom.rodriguez at oracle.com  Fri Sep 30 16:07:27 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 30 Sep 2016 09:07:27 -0700
Subject: RFR 8166929: [JVMCI] Expose decompile counts in MDO
In-Reply-To: <CAHjP37HAuG9vkRN5JA0mgTUOXOOJxQLh-5qdnrNiBp-QAh+3Lg@mail.gmail.com>
References: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>
	<CAHjP37HAuG9vkRN5JA0mgTUOXOOJxQLh-5qdnrNiBp-QAh+3Lg@mail.gmail.com>
Message-ID: <0C3B6756-89C4-4CB2-BA71-A509C45FF82F@oracle.com>


> On Sep 29, 2016, at 7:16 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> 
> Quick fly-by comment: HotSpotMethodData::toString should use %d for overflow recompiles count printing, like the other counters.

Thanks, that was a typo.  Fixed and updated in place.

tom

> 
> Thanks
> 
> On Thursday, September 29, 2016, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> http://cr.openjdk.java.net/~never/8166929/webrev
> https://bugs.openjdk.java.net/browse/JDK-8166929
> 
> This is a minor API addition to expose some of the top-level MDO decompile and recompile counts.  It's necessary to detect recompilation pathologies.  Tested by printing MDOs from JVMCI.  I also fixed a few problems I discovered with the formatting of the MDO printed form.
> 
> tom
> 
> 
> -- 
> Sent from my phone


From cthalinger at twitter.com  Fri Sep 30 22:32:42 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Fri, 30 Sep 2016 12:32:42 -1000
Subject: RFR 8166929: [JVMCI] Expose decompile counts in MDO
In-Reply-To: <0C3B6756-89C4-4CB2-BA71-A509C45FF82F@oracle.com>
References: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>
	<CAHjP37HAuG9vkRN5JA0mgTUOXOOJxQLh-5qdnrNiBp-QAh+3Lg@mail.gmail.com>
	<0C3B6756-89C4-4CB2-BA71-A509C45FF82F@oracle.com>
Message-ID: <0A743A77-9620-438C-8D04-C304744327AB@twitter.com>


> On Sep 30, 2016, at 6:07 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> 
>> On Sep 29, 2016, at 7:16 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> 
>> Quick fly-by comment: HotSpotMethodData::toString should use %d for overflow recompiles count printing, like the other counters.
> 
> Thanks, that was a typo.  Fixed and updated in place.

+  COMPILER2_OR_JVMCI_PRESENT(nonstatic_field(MethodCounters, _interpreter_invocation_count,   int))                                  \
+  COMPILER2_OR_JVMCI_PRESENT(nonstatic_field(MethodCounters, _interpreter_throwout_count,     u2))                                   \
Isn't that always true?

+    public int getDecompileCount() {
+        return UNSAFE.getInt(metaspaceMethodData + config.methodDataDecompiles);
+    }
+
+    public int getOverflowRecompileCount() {
+        return UNSAFE.getInt(metaspaceMethodData + config.methodDataOverflowRecompiles);
+    }
+
+    public int getOverflowTrapsCount() {
+        return UNSAFE.getInt(metaspaceMethodData + config.methodDataOverflowTraps);
+    }
This is high-level nitpicking:  the fields are plural but you named the methods singular except "OverflowTraps".  Either plural everywhere or none.
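
A minimal sketch of the consistent naming being asked for, assuming the singular form is chosen for all three accessors (the toString excerpt later in this thread calls getOverflowTrapCount()). The class below is a hypothetical stand-in: the real accessors read the counters from the metaspace MethodData via UNSAFE, whereas here they are plain int fields.

```java
import java.util.Locale;

// Hypothetical stand-in for the reviewed class; plain fields replace the UNSAFE reads.
public class MethodDataCounts {
    private final int decompiles;
    private final int overflowRecompiles;
    private final int overflowTraps;

    public MethodDataCounts(int decompiles, int overflowRecompiles, int overflowTraps) {
        this.decompiles = decompiles;
        this.overflowRecompiles = overflowRecompiles;
        this.overflowTraps = overflowTraps;
    }

    // All three accessors use the same singular "...Count" pattern.
    public int getDecompileCount() {
        return decompiles;
    }

    public int getOverflowRecompileCount() {
        return overflowRecompiles;
    }

    public int getOverflowTrapCount() {
        return overflowTraps;
    }

    @Override
    public String toString() {
        // %d is used for every counter, as requested earlier in the thread.
        return String.format(Locale.ROOT,
                "nof_decompiles(%d) nof_overflow_recompiles(%d) nof_overflow_traps(%d)",
                getDecompileCount(), getOverflowRecompileCount(), getOverflowTrapCount());
    }

    public static void main(String[] args) {
        System.out.println(new MethodDataCounts(1, 2, 3));
    }
}
```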

> 
> tom
> 
>> 
>> Thanks
>> 
>> On Thursday, September 29, 2016, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> http://cr.openjdk.java.net/~never/8166929/webrev
>> https://bugs.openjdk.java.net/browse/JDK-8166929
>> 
>> This is a minor API addition to expose some of the top-level MDO decompile and recompile counts.  It's necessary to detect recompilation pathologies.  Tested by printing MDOs from JVMCI.  I also fixed a few problems I discovered with the formatting of the MDO printed form.
>> 
>> tom
>> 
>> 
>> -- 
>> Sent from my phone
> 


From tom.rodriguez at oracle.com  Fri Sep 30 23:04:30 2016
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 30 Sep 2016 16:04:30 -0700
Subject: RFR 8166929: [JVMCI] Expose decompile counts in MDO
In-Reply-To: <0A743A77-9620-438C-8D04-C304744327AB@twitter.com>
References: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>
	<CAHjP37HAuG9vkRN5JA0mgTUOXOOJxQLh-5qdnrNiBp-QAh+3Lg@mail.gmail.com>
	<0C3B6756-89C4-4CB2-BA71-A509C45FF82F@oracle.com>
	<0A743A77-9620-438C-8D04-C304744327AB@twitter.com>
Message-ID: <E3E70163-6221-4071-9E59-4055A2A358C0@oracle.com>


> On Sep 30, 2016, at 3:32 PM, Christian Thalinger <cthalinger at twitter.com> wrote:
> 
> 
>> On Sep 30, 2016, at 6:07 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>> 
>> 
>>> On Sep 29, 2016, at 7:16 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>> 
>>> Quick fly-by comment: HotSpotMethodData::toString should use %d for overflow recompiles count printing, like the other counters.
>> 
>> Thanks, that was a typo.  Fixed and updated in place.
> 
> +  COMPILER2_OR_JVMCI_PRESENT(nonstatic_field(MethodCounters, _interpreter_invocation_count,   int))                                  \
> +  COMPILER2_OR_JVMCI_PRESENT(nonstatic_field(MethodCounters, _interpreter_throwout_count,     u2))                                   \
> Isn't that always true?

Yes, it is here.  I copied those lines from the normal vmstructs database, where it might not be true, but they can be removed here.  While fixing this I also realized that InvocationCounter wasn't declared in the JVMCI copy of vmstructs, so I added that.

> 
> +    public int getDecompileCount() {
> +        return UNSAFE.getInt(metaspaceMethodData + config.methodDataDecompiles);
> +    }
> +
> +    public int getOverflowRecompileCount() {
> +        return UNSAFE.getInt(metaspaceMethodData + config.methodDataOverflowRecompiles);
> +    }
> +
> +    public int getOverflowTrapsCount() {
> +        return UNSAFE.getInt(metaspaceMethodData + config.methodDataOverflowTraps);
> +    }
> This is high-level nitpicking:  the fields are plural but you named the methods singular except "OverflowTraps".  Either plural everywhere or none.

Yes, I fixed that in 8 while preparing this webrev but I missed updating it in 9.  I've put the updated webrev at http://cr.openjdk.java.net/~never/8166929.1/webrev

tom

> 
>> 
>> tom
>> 
>>> 
>>> Thanks
>>> 
>>> On Thursday, September 29, 2016, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>> http://cr.openjdk.java.net/~never/8166929/webrev
>>> https://bugs.openjdk.java.net/browse/JDK-8166929
>>> 
>>> This is a minor API addition to expose some of the top-level MDO decompile and recompile counts.  It's necessary to detect recompilation pathologies.  Tested by printing MDOs from JVMCI.  I also fixed a few problems I discovered with the formatting of the MDO printed form.
>>> 
>>> tom
>>> 
>>> 
>>> -- 
>>> Sent from my phone
>> 
> 


From cthalinger at twitter.com  Fri Sep 30 23:12:08 2016
From: cthalinger at twitter.com (Christian Thalinger)
Date: Fri, 30 Sep 2016 13:12:08 -1000
Subject: RFR 8166929: [JVMCI] Expose decompile counts in MDO
In-Reply-To: <E3E70163-6221-4071-9E59-4055A2A358C0@oracle.com>
References: <54C73C2D-F44F-491E-92C3-79DE73CE7B8F@oracle.com>
	<CAHjP37HAuG9vkRN5JA0mgTUOXOOJxQLh-5qdnrNiBp-QAh+3Lg@mail.gmail.com>
	<0C3B6756-89C4-4CB2-BA71-A509C45FF82F@oracle.com>
	<0A743A77-9620-438C-8D04-C304744327AB@twitter.com>
	<E3E70163-6221-4071-9E59-4055A2A358C0@oracle.com>
Message-ID: <874B3DFE-DF6A-43E5-AEFE-1AAE5BA628B0@twitter.com>


> On Sep 30, 2016, at 1:04 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
>> 
>> On Sep 30, 2016, at 3:32 PM, Christian Thalinger <cthalinger at twitter.com> wrote:
>> 
>> 
>>> On Sep 30, 2016, at 6:07 AM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>> 
>>> 
>>>> On Sep 29, 2016, at 7:16 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>>> 
>>>> Quick fly-by comment: HotSpotMethodData::toString should use %d for overflow recompiles count printing, like the other counters.
>>> 
>>> Thanks, that was a typo.  Fixed and updated in place.
>> 
>> +  COMPILER2_OR_JVMCI_PRESENT(nonstatic_field(MethodCounters, _interpreter_invocation_count,   int))                                  \
>> +  COMPILER2_OR_JVMCI_PRESENT(nonstatic_field(MethodCounters, _interpreter_throwout_count,     u2))                                   \
>> Isn't that always true?
> 
> Yes, it is here.  I copied those lines from the normal vmstructs database, where it might not be true, but they can be removed here.  While fixing this I also realized that InvocationCounter wasn't declared in the JVMCI copy of vmstructs, so I added that.
> 
>> 
>> +    public int getDecompileCount() {
>> +        return UNSAFE.getInt(metaspaceMethodData + config.methodDataDecompiles);
>> +    }
>> +
>> +    public int getOverflowRecompileCount() {
>> +        return UNSAFE.getInt(metaspaceMethodData + config.methodDataOverflowRecompiles);
>> +    }
>> +
>> +    public int getOverflowTrapsCount() {
>> +        return UNSAFE.getInt(metaspaceMethodData + config.methodDataOverflowTraps);
>> +    }
>> This is high-level nitpicking:  the fields are plural but you named the methods singular except "OverflowTraps".  Either plural everywhere or none.
> 
> Yes, I fixed that in 8 while preparing this webrev but I missed updating it in 9.  I've put the updated webrev at http://cr.openjdk.java.net/~never/8166929.1/webrev
Looks good.

Unrelated question:
         String nl = String.format("%n");
         String nlIndent = String.format("%n%38s", "");
+        sb.append("Raw method data for ");
+        sb.append(method.format("%H.%n(%p)"));
+        sb.append(":");
+        sb.append(nl);
+        sb.append(String.format("nof_decompiles(%d) nof_overflow_recompiles(%d) nof_overflow_traps(%d)%n",
+                        getDecompileCount(), getOverflowRecompileCount(), getOverflowTrapCount()));
Is pre-formatting nl really a win?  If yes, why are we not doing the same trick on the last line?
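
To make the question concrete, here is a small sketch comparing the two styles used in the excerpt: formatting the line separator once and appending it, versus embedding %n directly in the format string. The class and method names are hypothetical and the header text is shortened. Both styles produce the same string, so the sketch says nothing about which is faster; that would need measuring.

```java
public class NewlineFormatDemo {
    private static final String COUNTS =
            "nof_decompiles(%d) nof_overflow_recompiles(%d) nof_overflow_traps(%d)";

    // Style A: format the platform line separator once, then append it where needed.
    static String withCachedSeparator(int decompiles, int recompiles, int traps) {
        String nl = String.format("%n");
        StringBuilder sb = new StringBuilder();
        sb.append("Raw method data:");
        sb.append(nl);
        sb.append(String.format(COUNTS, decompiles, recompiles, traps));
        sb.append(nl);
        return sb.toString();
    }

    // Style B: let each format call expand %n itself.
    static String withInlineSeparator(int decompiles, int recompiles, int traps) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("Raw method data:%n"));
        sb.append(String.format(COUNTS + "%n", decompiles, recompiles, traps));
        return sb.toString();
    }

    public static void main(String[] args) {
        String a = withCachedSeparator(1, 2, 3);
        String b = withInlineSeparator(1, 2, 3);
        System.out.print(a);
        System.out.println("same output: " + a.equals(b));
    }
}
```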

> 
> tom
> 
>> 
>>> 
>>> tom
>>> 
>>>> 
>>>> Thanks
>>>> 
>>>> On Thursday, September 29, 2016, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>>> http://cr.openjdk.java.net/~never/8166929/webrev
>>>> https://bugs.openjdk.java.net/browse/JDK-8166929
>>>> 
>>>> This is a minor API addition to expose some of the top-level MDO decompile and recompile counts.  It's necessary to detect recompilation pathologies.  Tested by printing MDOs from JVMCI.  I also fixed a few problems I discovered with the formatting of the MDO printed form.
>>>> 
>>>> tom
>>>> 
>>>> 
>>>> -- 
>>>> Sent from my phone
